Description

Background of the Invention

This invention relates to recombinant plant nucleic acids and polypeptides. 

Improved means to manipulate plant gene expression are desired for a variety of industrial, agricultural, and
commercial food uses. To produce new plant varieties, it is necessary to change the genetic makeup of the crop or plant in
question. Desirable genes have to be incorporated into the crop or plant, and undesirable genes have to be eliminated
or replaced. In other words, one needs to genetically engineer the plant to meet the demands of agriculture. Accordingly,
genetic engineering of crop plants necessitates methods of identifying potentially valuable genes and transferring these
to the crop that one desires to improve.

Summary of the Invention 

We have identified and describe herein a novel plant transcriptional activator from the crucifer Arabidopsis thaliana.
In addition to its role as a transcriptional activator, we have also determined that this protein plays a role in plant defense
mechanisms by interacting with proteins, e.g., 3-O-methyltransferase and ascorbate peroxidase, involved in protecting
plants from pathogens. We named this protein AFT1 (Arabidopsis Fourteen-Three-three 1) because it shows sequence
homology to the widespread 14-3-3 protein family.

The AFT1 protein provides a means to enhance, control, modify or otherwise alter plant gene expression, e.g., as
a transcription activator or as a chimeric transcriptional activator, or even to modulate events during plant cell-signalling
processes, e.g., signal transduction events involved in plant defense responses to pathogens such as fungi, nematodes,
insects, bacteria, and viruses. Of special interest are the nucleic acid sequences corresponding to not only other AFT1
proteins found in the plant kingdom, but also sequences corresponding to proteins which interact with AFT1 during plant
signal transduction events, e.g., those pathways which operate during a plant's response to a pathogen, for applications
in genetic engineering, especially as related to agricultural biotechnology.

Accordingly, in general, the invention features recombinant AFT1 polypeptides, preferably including an amino acid
sequence substantially identical to the amino acid sequence shown in Fig. 1 (SEQ ID NO: 2). The invention also features
a recombinant polypeptide which is a fragment or analog of an AFT1 polypeptide that includes a domain capable of
activating transcription, e.g., AFT1 (34-248) or AFT1 (122-248). Transcription activation may be assayed, for example,
according to the methods described herein.

In various preferred embodiments, the polypeptide is derived from a plant (e.g., a monocot or dicot), and preferably
from a crucifer such as Arabidopsis.

In a second aspect, the invention features a chimeric AFT1 transcriptional activation protein including an AFT1
polypeptide fused to a DNA-binding polypeptide. In preferred embodiments, the DNA-binding polypeptide includes,
without limitation, Gal4 or LexA.

In a third aspect, the invention features a transgenic plant containing a transgene comprising an AFT1 protein
operably linked to a constitutive (e.g., the 35S CaMV promoter) or regulated or inducible promoter (e.g., rbcS promoter). In
other related aspects, the invention also features a transgenic plant containing a transgene containing a chimeric AFT1
transcriptional activator protein. In related aspects, the invention features a seed and a cell from a transgenic plant
containing the AFT1 protein/fragment or analog, or a chimeric AFT1 transcriptional activator protein.

In a fourth aspect, the invention features a transgenic plant expressing a polypeptide of interest which involves: (a)
a nucleic acid sequence encoding a chimeric AFT1 transcriptional activator protein; and (b) a nucleic acid sequence
encoding a polypeptide of interest in an expressible genetic construction, wherein the binding of the chimeric protein
regulates the expression of the polypeptide of interest. In preferred embodiments the polypeptide of interest is, without
limitation, a storage protein, e.g., napin, legumin, or phaseolin, or any other protein of agricultural significance.

In a fifth aspect, the invention features substantially pure DNA (for example, genomic DNA, cDNA, or synthetic DNA)
encoding an AFT1 protein. Accordingly, the invention features a nucleotide sequence substantially identical to the
nucleotide sequence shown in Fig. 1 (SEQ ID NO: 1). In related aspects, the invention also features substantially pure DNA
encoding a recombinant polypeptide including an amino acid sequence substantially identical to the amino acid
sequence of AFT1 polypeptide shown in Fig. 1 (SEQ ID NO: 2). Such DNA may, if desired, be operably linked to a
constitutive or regulated or inducible promoter as described herein. In preferred embodiments, the DNA sequence is
from a crucifer (e.g., Arabidopsis). In related aspects, the invention also features a vector, a cell (e.g., a plant cell), and
a transgenic plant or seed thereof which includes such substantially pure AFT1 DNA. In various preferred embodiments,
the cell is a prokaryotic cell, for example, E. coli or Agrobacterium, or more preferably, a eukaryotic cell, for example, a
transformed plant cell derived from a cell of a transgenic plant.

In a sixth aspect, the invention features a recombinant polypeptide which is a fragment or analog of an AFT1
polypeptide (SEQ ID NO: 2) including a domain capable of interacting with a plant defense related protein. Preferably, the
polypeptide is AFT1 (33-194). In related aspects, the invention also features substantially pure DNA encoding an AFT1
polypeptide fragment or analog; preferably the DNA is substantially identical to the DNA sequence shown in Fig. 1 (SEQ
ID NO: 1). In other aspects, the DNA is operably linked to a constitutive or regulated or inducible promoter.

By "crucifer" is meant any plant that is classified within the Cruciferae family as commonly described in, e.g., Gray's
Manual of Botany, American Book Company, N.Y., 1950; Hortus Third: A Concise Dictionary of Plants Cultivated in the
U.S. and Canada, Macmillan, 1976; or Simmons, N.W., Evolution of Crop Plants, 1986. The Cruciferae include many
agricultural crops, including broccoli, cabbage, Brussels sprouts, rapeseed, kale, Chinese kale, cauliflower, horseradish,
and Arabidopsis.

By "AFT1" is meant a crucifer polypeptide capable of effecting transcriptional activation or interacting with a
polypeptide involved with a plant defense polypeptide. Such an AFT1 polypeptide has the sequence shown in Fig. 1 (SEQ ID
NO: 1).

By "protein" and "polypeptide" is meant any chain of amino acids, regardless of length or post-translational
modification (e.g., glycosylation or phosphorylation).

By "substantially identical" is meant a polypeptide or nucleic acid exhibiting at least 90%, preferably 93%, more
preferably 95%, and most preferably 97% homology to a reference amino acid or nucleic acid sequence.
For polypeptides, the length of comparison sequences will generally be at least 16 amino acids, preferably at least 20
amino acids, more preferably at least 25 amino acids, and most preferably 35 amino acids. For nucleic acids, the length
of comparison sequences will generally be at least 50 nucleotides, preferably at least 60 nucleotides, more preferably
at least 75 nucleotides, and most preferably 110 nucleotides.

Homology is typically measured using sequence analysis software (e.g., Sequence Analysis Software Package of
the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI
53705). Such software matches similar sequences by assigning degrees of homology to various substitutions, deletions,
and other modifications. Conservative substitutions typically include substitutions within the following
groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine,
threonine; lysine, arginine; and phenylalanine, tyrosine.
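The thresholds in the "substantially identical" definition can be illustrated with a minimal sketch. It assumes the two sequences are already aligned to equal length with "-" marking gaps, and it uses plain percent identity as a stand-in for the homology score; it does not reproduce the scoring of the Genetics Computer Group package. The function names and tier labels are hypothetical.

```python
# Minimum comparison lengths and percentage tiers taken from the definition above.
MIN_LENGTH = {"polypeptide": 16, "nucleic_acid": 50}
TIERS = [
    (97.0, "most preferred (>= 97%)"),
    (95.0, "more preferred (>= 95%)"),
    (93.0, "preferred (>= 93%)"),
    (90.0, "substantially identical (>= 90%)"),
]


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same non-gap residue.

    Assumption: gap columns count as mismatches, a simplification of
    what real alignment software does.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be pre-aligned, equal length, non-empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def classify(aligned_a, aligned_b, kind):
    """Map a pre-aligned pair onto the tiers of the definition."""
    if len(aligned_a) < MIN_LENGTH[kind]:
        return "comparison too short"
    pct = percent_identity(aligned_a, aligned_b)
    for threshold, label in TIERS:
        if pct >= threshold:
            return label
    return "not substantially identical"


if __name__ == "__main__":
    reference = "A" * 20
    variant = "A" * 19 + "G"  # one mismatch in 20 positions
    print(percent_identity(reference, variant), classify(reference, variant, "polypeptide"))
```

Treating gaps as mismatches is the stricter choice; a scoring scheme that credits conservative substitutions, as the text describes, would generally report higher values.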

By a "substantially pure polypeptide" is meant an AFT1 protein which has been separated from components which
naturally accompany it. Typically, the polypeptide is substantially pure when it is at least 60%, by weight, free from the
proteins and naturally-occurring organic molecules with which it is naturally associated. Preferably, the preparation is at
least 75%, more preferably at least 90%, and most preferably at least 99%, by weight, AFT1 polypeptide. A substantially
pure AFT1 polypeptide may be obtained, for example, by extraction from a natural source (e.g., a plant cell); by expression
of a recombinant nucleic acid encoding an AFT1 polypeptide; or by chemically synthesizing the protein. Purity can be
measured by any appropriate method, e.g., column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.

A protein is substantially free of naturally associated components when it is separated from those contaminants
which accompany it in its natural state. Thus, a protein which is chemically synthesized or produced in a cellular system
different from the cell from which it naturally originates will be substantially free from its naturally associated components.
Accordingly, substantially pure polypeptides include those derived from eukaryotic organisms but synthesized in E. coli
or other prokaryotes.

By "substantially pure DNA" is meant DNA that is free of the genes which, in the naturally-occurring genome of the
organism from which the DNA of the invention is derived, flank the gene. The term therefore includes, for example, a
recombinant DNA which is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the
genomic DNA of a prokaryote or eukaryote; or which exists as a separate molecule (e.g., a cDNA or a genomic or cDNA
fragment produced by PCR or restriction endonuclease digestion) independent of other sequences. It also includes a
recombinant DNA which is part of a hybrid gene encoding additional polypeptide sequence.

By "transformed cell" is meant a cell into which (or into an ancestor of which) has been introduced, by means of
recombinant DNA techniques, a DNA molecule encoding (as used herein) an AFT1 protein or an AFT1 chimeric
transcriptional activator.

By "promoter" is meant a DNA sequence sufficient to direct transcription; such elements may be located in the 5'
or 3' regions of the gene. By "constitutive" promoter is meant a promoter capable of mediating gene expression without
regulation, i.e., the promoter is always transcriptionally active. By "regulated or inducible" promoter is meant a promoter
capable of mediating gene expression in response to a variety of developmental (e.g., cell-specific, tissue-specific, and
organ-specific promoters), environmental, and hormonal cues including, but not limited to, promoters such as the rbcS,
wun1, chlorophyll a/b, or E 2 promoters described herein.

By "operably linked" is meant that a gene and a regulatory sequence(s) (e.g., a promoter) are connected in such a
way as to permit gene expression when the appropriate molecules (e.g., transcriptional activator proteins) are bound to
the regulatory sequence(s).

By "plant cell" is meant any self-propagating cell bounded by a semi-permeable membrane and containing a plastid.
Such a cell also requires a cell wall if further propagation is desired. Plant cell, as used herein, includes, without limitation,






EP 0 693 554 A1 

algae, cyanobacteria, seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots,
gametophytes, sporophytes, pollen, and microspores.

By "transgene" is meant any piece of DNA which is inserted by artifice into a cell, and becomes part of the genome
of the organism which develops from that cell. Such a transgene may include a gene which is partly or entirely
heterologous (i.e., foreign) to the transgenic organism, or may represent a gene homologous to an endogenous gene of the
organism.

By "transgenic" is meant any cell which includes a DNA sequence which is inserted by artifice into a cell and becomes
part of the genome of the organism which develops from that cell. As used herein, the transgenic organisms are generally
transgenic plants and the DNA (transgene) is inserted by artifice into either the nuclear or plastidic genome.

By "plant defense related protein" is meant any protein which is involved in the protection or resistance to plant pests
(e.g., bacteria, insects, nematodes, fungi, and viruses). Such proteins include, without limitation, 3-O-methyltransferases,
ascorbate peroxidases, chalcone synthases, hydroxyproline rich glycoproteins, glucanases, chitinases, and proteinase
inhibitors.

Other features and advantages of the invention will be apparent from the following description of the preferred
embodiments thereof, and from the claims.

Detailed Description

The drawings will first be briefly described. 


Drawings 

Fig. 1 is the nucleic acid sequence (SEQ ID NO:1) and deduced amino acid sequence of Arabidopsis AFT1 (SEQ 
ID NO:2). 

Fig. 2 shows the LexA-dependent activation of LEU2 expression by AFT1; activation was monitored by the growth
of yeast on a leucine-minus plate. The AFT1 clone in vector pJG4-5, which directs the production of AFT1/B42 fusion
protein, was introduced into the yeast strain EGY48 where different plasmids had already been introduced. The plasmids
which either direct production of different LexA fusion proteins or no LexA protein are pEG202 (LexA alone, a),
pHM1-1 (LexA/Bicoid, b), pHM12 (LexA/Cdc2, c), pHM7-3 (LexA/Ftz homeodomain, d), pAKR1-261 (LexA/AKR1-261, e),
pAKR249-434 (LexA/AKR249-434, f), pAKR114-434 (LexA/AKR114-434, g), and pHM (no LexA, h).

Figs. 3A and 3B are schematic representations showing transcription activation by AFT1. The effects of various
fusion proteins were monitored by the growth of yeast in the absence of leucine and quantitated by measuring the activity
of the β-galactosidase. Panel (A) shows transcription activation by AFT1 and its derivatives fused to the activation domain
B42 upon introduction into the yeast strain EGY48. This strain also contains the plasmid pEG202, which directs
constitutive production of LexA protein, and plasmid pSH18-34, which contains the reporter gene LexAop-LacZ. Panel (B)
shows transcription activation by AFT1 and its derivatives fused to the LexA protein in the plasmid pEG202 upon
introduction into the yeast strain EGY48 containing the plasmid pSH18-34 only.

Fig. 4 shows a genomic Southern blot analysis. The blot was probed with a labeled AFT1 cDNA clone. The lanes
labeled C contain Columbia DNA and L, Landsberg DNA. The restriction enzymes used are indicated above the lanes.
The sizes of λ-Hind III digested DNA fragments used as length markers are shown on the left.

Figs. 5A, 5B and 5C show an RNA blot analysis of AFT1 expression. Panel (A) shows the developmental expression
of AFT1. RNAs were extracted from greenhouse-grown plants. Panel (B) shows the organ-specific expression of AFT1.
RNAs of leaf, root and stem were extracted from plate-grown plants, and RNAs of flower and silique were extracted
from greenhouse-grown plants. Panel (C) shows the effect of light on the expression of Lhca2 and AFT1. RNAs were
extracted from greenhouse-grown plants.

Fig. 6 shows the DNA sequence (SEQ ID NO: 17) of an isolated cDNA found to be an AFT1 interacting protein
coding for ascorbate peroxidase.

Fig. 7 shows the partial amino acid sequence (SEQ ID NO: 18) of ascorbate peroxidase deduced from the isolated
cDNA (SEQ ID NO: 17).

Fig. 8 shows the DNA sequence (SEQ ID NO: 19) of an isolated cDNA found to be an AFT1 interacting protein
coding for 3-O-methyltransferase.

Fig. 9 shows the partial amino acid sequence (SEQ ID NO: 20) of 3-O-methyltransferase deduced from the isolated
cDNA (SEQ ID NO: 19).

Fig. 10 shows the DNA sequence (SEQ ID NO: 21) of an isolated cDNA found to be an AFT1 interacting protein
coding for an Arabidopsis ankyrin repeating protein AKR2.

Fig. 11 shows the partial amino acid sequence (SEQ ID NO: 22) of an Arabidopsis ankyrin repeating protein AKR2
deduced from the isolated cDNA (SEQ ID NO: 21).

Fig. 12 shows the DNA sequence (SEQ ID NO: 23) of an isolated cDNA found to be an AFT1 interacting protein
coding for proteasome.

Fig. 13 shows the partial amino acid sequence (SEQ ID NO: 24) of proteasome deduced from the isolated cDNA
(SEQ ID NO: 23).

Fig. 14 shows the DNA sequence (SEQ ID NO: 25) of an isolated cDNA found to be an AFT1 interacting protein.

Fig. 15 shows the partial amino acid sequence (SEQ ID NO: 26) deduced from the isolated cDNA (SEQ ID NO: 25).

Polypeptides According to the Invention

Polypeptides according to the invention include the entire Arabidopsis AFT1 protein (as described in Fig. 1; SEQ
ID NO: 2). These polypeptides are used, e.g., to manipulate plant gene expression at the transcriptional level (as
discussed infra) or to manipulate the plant signal transduction pathway by providing plants with the potential of resisting
pathogens such as fungi, insects, nematodes, bacteria, and viruses. Polypeptides of the invention also include any
analog or fragment of the Arabidopsis AFT1 protein capable of activating transcription in a host plant. The efficacy of
an AFT1 analog or fragment to activate transcription is dependent upon its ability to interact with the transcription
complex; such an interaction may be readily assayed using any number of standard in vivo methods, e.g., the interaction
trap mechanism described infra. Similarly, the polypeptides of the invention include chimeric AFT1 transcriptional
activator proteins capable of selectively activating transcription of a specified gene.

Specific AFT1 analogs of interest include full-length or partial (described infra) AFT1 proteins, including amino acid
sequences which differ only by conservative amino acid substitutions, for example, substitutions of one amino acid for
another of the same class (e.g., valine for glycine, arginine for lysine, etc.) or by one or more non-conservative amino
acid substitutions, deletions, or insertions at positions of the amino acid sequence which will not destroy AFT1's ability
to activate transcription (e.g., as assayed infra).

Specific AFT1 fragments of interest include any portions of the AFT1 protein which are capable of interaction with
an AFT1 ligand, e.g., a member of the transcriptional complex or a protein involved in plant defense mechanisms, such
as 3-O-methyltransferase and ascorbate peroxidase. Identification of such ligands may be readily assayed using any
number of standard in vivo methods, e.g., the interaction trap mechanism described infra.

There now follows a description of the cloning and characterization of an Arabidopsis AFT-encoding cDNA useful 
in the instant invention, and a characterization of its ability to activate transcription, and its protein interacting properties. 
This example is provided for the purpose of illustrating the invention and should not be construed as limiting. 

Isolation of an Arabidopsis Gene Encoding an AFT protein 

The Arabidopsis AFT1 gene was isolated as follows.
A yeast interaction trap system (Zervos et al., Cell 72:223-232, 1993; Gyuris et al., Cell 75:791-803, 1993) was modified
for the isolation of an Arabidopsis AFT protein. The yeast strain EGY48 (MATa trp1 ura3 his3 LEU2::pLexAop6-LEU2)
containing a plasmid pJK103 (Zervos et al., supra) that directs expression of a Gal1-lacZ gene from two high affinity
ColE1 LexA operators, was used in the interaction trap experiment. A "bait" (LexA/AKR1-261, residues 1-261 of AKRP
(Arabidopsis anKyrin repeat protein) fused to DNA binding protein LexA) was introduced into the strain and then an
Arabidopsis cDNA expression library was introduced (see, e.g., Zhang et al., Plant Cell 4:1575-1588, 1992). Selection
was first carried out on leucine minus plates, and Leu+ colonies were analyzed on X-gal plates. The clones which activated
transcription of reporter genes in the presence of, but not in the absence of, the LexA protein or its fusion derivatives
were isolated.

The oligo(dT)-primed activation-tagged cDNA expression library in vector pJG4-5 (Gyuris et al., supra) was made
from mRNA of four week-old Arabidopsis leaves. The yeast strain EGY48, the vector plasmids pJG4-5 and pEG202,
and the plasmids pHM1-1, pHM7-3, pHM12, pHM*, and pSH18-34 were provided by Dr. Roger Brent. The LexA/AKR
fusion proteins were constructed as follows. The oligonucleotides used to amplify desired AKR fragments which were
later subcloned into pEG202 are shown below.

OAB-9: GCGGAATTCATGAGGCCCATTAAAATT (SEQ ID NO: 3)
OAB-10: GTAGGATCCGGTCGGATTTCTTGTCGC (SEQ ID NO: 4)
OAB-11: CGCGAATTCAATAGCGACAAGTACGAT (SEQ ID NO: 5)
OAB-12: GTAGGATCCGTCTCTCTTCCAAGGTAGA (SEQ ID NO: 6)
OAB-20: GATCCTAGAATTCAAGAAGAATCGGCGTGGC (SEQ ID NO: 7)
The combinations of oligonucleotides used for fusion proteins are: OAB-9 and OAB-10 (LexA/AKR1-261); OAB-11 and
OAB-12 (LexA/AKR249-434); OAB-20 and OAB-12 (LexA/AKR114-434).

Normally, with this technique, a library that expresses cDNA-encoded proteins fused to a transcription activator
domain (B42) is introduced into a special yeast strain. This strain also contains a plasmid which directs constitutive
production of a transcriptionally inert LexA fusion protein which is called the "bait" (LexA fused to the protein of
interest) and two reporter genes. The transcription of these two reporter genes can be stimulated if the cDNA-encoded
protein interacts with the bait. One reporter gene, LEU2, allows growth in the absence of leucine, and the other
reporter gene, LacZ, codes for β-galactosidase.
We found that many proteins encoded by Arabidopsis cDNAs activated transcription with LexA protein alone, or
with many different baits, although all of these proteins required a LexA binding domain. This results in the isolation of
cDNA clones which are not true interaction partners of the "bait" and requires further analysis to separate these "false
positive" clones from the desired partner clones. Examples of activation by AFT1 which is dependent upon the presence
of LexA are shown in Fig. 2. To further understand such activation, we characterized 81 cDNA clones which encoded
proteins capable of activating the expression of the reporter genes. Among the cDNAs sequenced, 36 clones were
derived from the same gene which encodes a 14-3-3-like protein. This gene was named AFT1 (Arabidopsis
Fourteen-Three-three 1), and the protein AFT1 encodes is designated as AFT1. AFT1 contains 248 amino acids with a molecular
weight of about 28 kD.
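The distinction drawn above between true interaction partners and clones that activate with LexA alone or with many baits can be sketched as a small decision rule. This is only an illustration: the function name, the category labels, and the example activation profiles are hypothetical and are not data from the experiments described here.

```python
def classify_clone(activates, bait):
    """Classify a library clone from its reporter activation profile.

    activates: dict mapping each LexA construct tested to True/False.
               It must contain the keys "no LexA", "LexA alone", and `bait`;
               any other keys are treated as unrelated baits.
    """
    if not activates[bait]:
        return "no activation with bait"
    if activates["no LexA"]:
        return "LexA-independent activation"
    others = [name for name in activates if name not in (bait, "no LexA")]
    if any(activates[name] for name in others):
        # Activates with LexA alone or unrelated baits: not a specific partner.
        return "LexA-dependent but not bait-specific"
    return "candidate bait-specific partner"


if __name__ == "__main__":
    bait = "LexA/AKR1-261"
    # Hypothetical profiles, for illustration only.
    promiscuous = {"no LexA": False, "LexA alone": True, bait: True, "LexA/Bicoid": True}
    specific = {"no LexA": False, "LexA alone": False, bait: True, "LexA/Bicoid": False}
    print(classify_clone(promiscuous, bait))
    print(classify_clone(specific, bait))
```

Under this rule, a clone that needs a LexA binding domain but responds to LexA alone or to unrelated fusions falls in the middle category, which is the behaviour the text reports for the "false positive" clones.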

Transcription Activation by AFT1

A series of experiments was performed to determine which AFT1 sequences were required for transcriptional
activation in the yeast interaction trap system. Accordingly, a series of deletion constructs was made and analyzed
according to methods known in the art as follows. To test activation by B42/AFT1 fusion proteins, a series of AFT1
derivatives fused to B42 in the plasmid pJG4-5 was constructed. These plasmids were introduced into the strain EGY48
containing the plasmid pEG202, which directs the constitutive production of LexA protein, and the plasmid pSH18-34,
which contains the LexAop-LacZ reporter gene. To test activation by LexA/AFT1 fusion proteins, a series of AFT1
derivatives fused to LexA in the plasmid pEG202 was constructed and introduced into the strain EGY48 containing
the plasmid pSH18-34. Transcription activation by AFT1 and its derivatives was measured by the growth of yeast on
leucine minus plates and the activity of β-galactosidase. The assay for β-galactosidase was conducted as described by
Zervos et al., supra. The oligonucleotides used to amplify desired AFT1 fragments which were later subcloned into
pJG4-5 and pEG202 are shown below.

JW-5: CTGACTGAATTCATGGCGGCGACATTAGG (SEQ ID NO: 8)
JW-6: GACTGAGTCGACCCTTCATCTAGATCCTC (SEQ ID NO: 9)
JW-7: GACTGACTCGAGCCTTCATCTAGATCCTCA (SEQ ID NO: 10)
JW-8: CTGACTGAATTCGAGTCTAAGGTCTTTAC (SEQ ID NO: 11)
JW-9: GACTGACTCGAGACTCGCTCCAGCAGATGG (SEQ ID NO: 12)
JW-10: GACTGACTCGAGTGAAGAATTGAGAATCTC (SEQ ID NO: 13)
JW-11: GACTGAGTCGACACTCGCTCCAGCAGATGG (SEQ ID NO: 14)
JW-12: GACTGAGTCGACTGAAGAATTGAGAATCTC (SEQ ID NO: 15)
JW-13: CTGACTGAATTCGTTACAGGCGCTACTCCAG (SEQ ID NO: 16)
The combinations of oligonucleotides used for fusion proteins were: JW-5 and JW-6 (LexA/1-248); JW-5 and
JW-12 (LexA/1-194); JW-5 and JW-11 (LexA/1-121); JW-13 and JW-6 (LexA/34-248); JW-8 and JW-6 (LexA/122-248);
JW-5 and JW-7 (B42/1-248); JW-5 and JW-9 (B42/1-121); JW-13 and JW-7 (B42/34-248); JW-8 and JW-7 (B42/122-248);
JW-13 and JW-10 (B42/34-194).
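The primer design can be inspected with a short sketch that scans each JW oligonucleotide for three common six-base restriction sites. The enzyme names (EcoRI, SalI, XhoI) are an annotation added here from the well-known recognition sequences; the text itself does not name the enzymes used for subcloning, so the pattern reported below should be read as an observation about the sequences, not as the authors' stated cloning strategy.

```python
# Recognition sequences of three common restriction enzymes (annotation, not from the text).
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC", "XhoI": "CTCGAG"}

# Primer sequences as listed above.
PRIMERS = {
    "JW-5": "CTGACTGAATTCATGGCGGCGACATTAGG",
    "JW-6": "GACTGAGTCGACCCTTCATCTAGATCCTC",
    "JW-7": "GACTGACTCGAGCCTTCATCTAGATCCTCA",
    "JW-8": "CTGACTGAATTCGAGTCTAAGGTCTTTAC",
    "JW-9": "GACTGACTCGAGACTCGCTCCAGCAGATGG",
    "JW-10": "GACTGACTCGAGTGAAGAATTGAGAATCTC",
    "JW-11": "GACTGAGTCGACACTCGCTCCAGCAGATGG",
    "JW-12": "GACTGAGTCGACTGAAGAATTGAGAATCTC",
    "JW-13": "CTGACTGAATTCGTTACAGGCGCTACTCCAG",
}

# Primer pairs for each fusion construct, as listed above: (first primer, second primer).
CONSTRUCTS = {
    "LexA/1-248": ("JW-5", "JW-6"),
    "LexA/1-194": ("JW-5", "JW-12"),
    "LexA/1-121": ("JW-5", "JW-11"),
    "LexA/34-248": ("JW-13", "JW-6"),
    "LexA/122-248": ("JW-8", "JW-6"),
    "B42/1-248": ("JW-5", "JW-7"),
    "B42/1-121": ("JW-5", "JW-9"),
    "B42/34-248": ("JW-13", "JW-7"),
    "B42/122-248": ("JW-8", "JW-7"),
    "B42/34-194": ("JW-13", "JW-10"),
}


def sites_in(sequence):
    """Return the sorted names of the listed sites found in a primer sequence."""
    return sorted(name for name, site in SITES.items() if site in sequence)


if __name__ == "__main__":
    for construct, (first, second) in CONSTRUCTS.items():
        print(construct, first, sites_in(PRIMERS[first]), second, sites_in(PRIMERS[second]))
```

Run against the sequences above, the first primer of every pair carries GAATTC, while the second primer carries GTCGAC in the LexA constructs and CTCGAG in the B42 constructs.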

Results from such experiments revealed that deletion of the C-terminal half of AFT1 (B42/1-121) completely
abolished AFT1's ability to activate, whereas deletion of either 33 or 121 residues from the N-terminus (B42/34-248 and
B42/122-248) increased activation (Fig. 3A). The reason for the increased activation is not known, but might be due to
the tertiary structures of these two fusion proteins (B42/34-248 and B42/122-248) which could result in stronger
interactions with the transcriptional machinery. Nevertheless, it is the C-terminal half that is responsible for the observed
activation when AFT1 is fused to B42, e.g., AFT1 residues 34-248 (SEQ ID NO: 2) and 122-248 (SEQ ID NO: 2). However,
since B42 is an activator domain, the observed transcription activation may be due to the direct interaction of AFT1 with
LexA, thereby bringing B42 into the proximity of the reporter gene promoter. An alternate possibility is suggested by the
acidic nature of AFT1 (pI = 4.6), namely, AFT1 itself might be a transcription activator, since it shares this acidic feature
with many transcription activators.

AFT1 was also fused directly to LexA to test if AFT1 can activate transcription. The results shown in Fig. 3B
demonstrate that AFT1 does activate transcription. To determine which portion of AFT1 was important for activation, 54
amino acids were deleted from the AFT1 C-terminus (LexA/1-194). This deletion caused AFT1 to lose its ability to activate
completely, whereas deletion of 33 amino acids from the N-terminus (LexA/34-248) decreased activation by about 75%.
As shown in Panel B of Fig. 3, when the N-terminal half of AFT1 was deleted (LexA/122-248), activation dropped to
basal levels. Thus, even though the C-terminal half is critical for activation and is more acidic than the N-terminal half,
the N-terminal half also plays a role in activation.
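The residue counts quoted for the LexA fusions follow directly from the fragment coordinates and the 248-residue length of AFT1. The sketch below makes that arithmetic explicit; the helper name is hypothetical and the outcome labels are paraphrases of the results described above, not measured values.

```python
FULL_LENGTH = 248  # residues in AFT1, as stated in the text


def deleted_residues(first, last, full_length=FULL_LENGTH):
    """Return (N-terminal, C-terminal) residue counts missing from fragment first-last."""
    if not 1 <= first <= last <= full_length:
        raise ValueError("fragment must lie within the full-length protein")
    return first - 1, full_length - last


# LexA fusion outcomes paraphrased from the description of Fig. 3B.
LEXA_FUSIONS = {
    (1, 248): "activates",
    (1, 194): "no activation",
    (34, 248): "activation reduced by about 75%",
    (122, 248): "basal activation",
}

if __name__ == "__main__":
    for (first, last), outcome in LEXA_FUSIONS.items():
        n_del, c_del = deleted_residues(first, last)
        print(f"LexA/{first}-{last}: {n_del} N-terminal and {c_del} C-terminal residues removed -> {outcome}")
```

The same function gives 127 C-terminal residues removed for the B42/1-121 construct, consistent with its description as lacking the C-terminal half.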

AFT1 Copy Number

The copy number of the AFT1 gene was determined by genomic DNA (Southern) blot analysis. Genomic DNA was
prepared according to the method of Dellaporta et al. (Plant Mol. Biol. Rep. 4:19-21, 1983), digested with restriction
enzymes, electrophoresed (5 µg per lane), blotted to a Biotrans™ Nylon membrane, and hybridized with labeled AFT1
cDNA clone. Hybridizations were carried out according to the method of Church and Gilbert (Proc. Natl. Acad. Sci. USA
81:1991-1995, 1984) using probes labeled by random priming. The washing conditions were as follows: two times (10
minutes each) in 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), and 5.0% SDS at 63°C; then four times (5 minutes
each) in 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), and 1% SDS at 63°C. The conditions for deprobing filters were as follows:
two times (15 minutes each) in 2 mM Tris (pH 8.2), 2 mM EDTA (pH 8.0), and 0.1% SDS at 70°C for DNA blots and at
80°C for RNA blots.

As shown in Fig. 4, digestion of DNA from two ecotypes of Arabidopsis (Columbia and Landsberg) with the enzymes
Bgl II and Hind III gave rise to two bands after the DNA blot was probed with a labelled AFT1 cDNA sequence. These
data indicate that only one copy of AFT1 was present in both ecotypes of Arabidopsis, since there was one restriction
site for Bgl II and one site for Hind III within the AFT1 cDNA.
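The copy-number inference can be written out as simple counting. The sketch assumes an idealized case that the text does not spell out: the probe spans the whole cDNA, every gene copy has the same number of internal sites for the enzyme, introns add no further sites, and all resulting fragments resolve as distinct bands. The function names are hypothetical.

```python
def expected_bands(gene_copies, internal_sites):
    """Bands expected when a full-length probe detects every fragment of every copy.

    A gene cut at `internal_sites` positions is split into internal_sites + 1
    fragments that each hybridize to the probe (idealized assumption).
    """
    if gene_copies < 0 or internal_sites < 0:
        raise ValueError("counts must be non-negative")
    return gene_copies * (internal_sites + 1)


def infer_copies(observed_bands, internal_sites):
    """Invert expected_bands; return None if the band count does not fit the model."""
    per_copy = internal_sites + 1
    if observed_bands % per_copy:
        return None
    return observed_bands // per_copy


if __name__ == "__main__":
    # One internal Bgl II site (or one Hind III site) and two observed bands.
    print(infer_copies(observed_bands=2, internal_sites=1))
```

With one internal site per enzyme and two bands per digest, the model returns a single copy, matching the conclusion drawn from Fig. 4.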

Developmental Expression Pattern of the AFT1 Gene In Arabidopsis 

The developmental and organ-specific expression of AFT1, as well as the light regulation of AFT1 expression, were
studied by RNA (Northern blot) analysis. Total RNA was isolated according to the method of Logemann et al. (Anal.
Biochem. 163:16-20, 1987), separated by electrophoresis (15 µg per lane), blotted to a Biotrans™ Nylon membrane,
and hybridized to the labeled AFT1 cDNA clone and the Arabidopsis Lhca2 cDNA clone. The conditions for hybridization
and washing were the same as described in genomic Southern analysis supra. RNAs were extracted from Arabidopsis
grown either in a greenhouse (16 hr light/8 hr dark at 25 ± 5°C) or on agarose plates in a tissue culture room (16 hr
light/8 hr dark at 20 ± 2°C). Greenhouse-grown plants were used for developmental expression analyses. Leaves were
harvested weekly for RNA preparation. Greenhouse-grown plants were also used for light induction experiments. At four
weeks, plants were moved to a dark chamber for three days, then shifted back to light. Leaves were then harvested
every two hours. Tissue culture-grown plants were used for organ-specific expression analyses. Leaf, root, and stem
mRNAs were isolated from plants grown for 35 days on agarose plates in MS medium supplemented with 1% sucrose, and
the flower and silique mRNAs were isolated from plants grown for 35 days in the greenhouse. The MS was purchased
from Sigma (Cat# M-0153). As shown in Fig. 5, Panel A and Table I, when total RNAs isolated from leaves of one to five
week-old plants were hybridized to a labelled AFT1 cDNA, the steady-state mRNA level of AFT1 did not change
significantly over a five week period.

When RNAs isolated from different organs were analyzed, the steady-state mRNA level in silique was found to be
about one fifth of that in flower, whereas the mRNA levels in leaves, roots, and stems were about the same (Fig. 5, Panel
B; Table I). It should be noted that the mRNA levels from flowers and siliques are not directly comparable to those from
leaves, roots, and stems (Fig. 5, Panel B), because they were from materials grown under different conditions (as
described supra). However, the steady-state mRNA levels of flower and silique can be compared to that of five-week-old
leaves shown in Fig. 5, Panel A. The quantitative data indicate that the AFT1 mRNA level in leaves is about two
times higher than that in flowers and nine times higher than that in siliques (Table I, infra). The growth conditions can
affect the steady-state mRNA level since greenhouse-grown plants contained three times more AFT1 mRNA than
plate-grown plants (Fig. 5, Panels A and B; Table I, infra). These data indicate that although AFT1 expression is probably
required throughout much of the Arabidopsis life cycle, its steady-state mRNA level is still regulated organ-specifically.
Furthermore, dark-adapted plants contain at least two times more steady-state mRNA than plants grown in light (Fig.
5, Panel C; Table I, infra), suggesting that light plays a role in the down-regulation of AFT1 expression.
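The fold differences quoted above can be cross-checked against each other with a few lines of arithmetic. The sketch uses only the approximate ratios stated in the text (leaf about two times flower, leaf about nine times silique); it does not use the Table I intensities, and the function name is hypothetical.

```python
def flower_over_silique(leaf_over_flower, leaf_over_silique):
    """Flower-to-silique ratio implied by two leaf-relative fold differences."""
    if leaf_over_flower <= 0 or leaf_over_silique <= 0:
        raise ValueError("fold differences must be positive")
    return leaf_over_silique / leaf_over_flower


if __name__ == "__main__":
    ratio = flower_over_silique(leaf_over_flower=2.0, leaf_over_silique=9.0)
    # Implied silique level as a fraction of the flower level.
    print(ratio, 1.0 / ratio)
```

The implied silique level is a little over one fifth of the flower level, in line with the "about one fifth" stated for the organ comparison.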

The relative intensities of AFT1 mRNA derived from the data in Figs. 5A-5C are shown below in Table I. The relative
intensity data were collected from β-scanning of RNA gel blots by a Blot Analyzer, and normalized using the intensity of

We have shown that the AFT1 gene of Arabidopsis encodes a novel protein which can activate transcription in yeast.
Accordingly, we conclude that AFT1 functions as a transcriptional activator.

Chimeric AFT1 Proteins As Targeted Transcriptional Activators

Since plant gene expression varies in accordance with developmental stages of different cell types and in response
to different environmental factors and hormonal cues, the proteins (including the gene regulatory sequences) of the
present invention are most useful for applications aimed at improving or engineering plant varieties of agricultural or
[...] capable of selectively activating transcription of a specified gene, e.g., a crucifer storage protein such as napin. Targeted
transcription of a gene is achieved by imbuing the AFT1 transcriptional activator with the ability to selectively activate a
specific gene by fusing it to a DNA-binding domain which is capable of binding to the 5' upstream regulatory region,
e.g., in the vicinity of the transcription start site. Such chimeric proteins contain two parts: the AFT1 transcriptional
activation region (described [...] region of interest. For example, a chimeric AFT1 transcriptional activator protein may be
produced by fusing a Gal4 DNA binding region (see, e.g., Ma et al., Nature 334:631-633, 1988; Ma et al., Cell 48: [...]) [...]
activating portion of AFT1 according to methods known in the art (e.g., see Sadowski et al., Nature 335:563-564, 1988).

Importantly, the gene of interest, e.g., a napin storage protein gene, placed under the transcriptional control of an
AFT1 chimeric activator must include the appropriate DNA recognition sequence in its 5' upstream region. For example,
to activate napin gene expression with a Gal4-AFT1 protein, the napin gene should [...] sequence (UAS). Construction
of such clones is well known in the art and is discussed infra. Moreover, those skilled in
the art will easily recognize that the DNA binding domain component of the chimeric activator protein may be derived
from any appropriate eukaryotic or prokaryotic source. Thus, fusion genes encoding chimeric AFT1 transcriptional
activator proteins can be constructed which include virtually any DNA binding domain and the AFT1 transcriptional activator,
provided that the gene placed under the transcriptional control of the AFT1 chimeric activator contains the requisite DNA
regulatory sequences which facilitate its binding. Such chimeric AFT1 transcriptional activator proteins are capable of
activating transcription efficiently in transgenic plants (plasmid construction [...] such chimeric AFT1 transcriptional
activator proteins, e.g., AFT1/Gal4, are capable of specifically activating and
overexpressing the desired gene product.
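
The requirement that the target gene carry the cognate recognition sequence in its 5' upstream region can be checked computationally before a construct is built. The sketch below scans an upstream sequence for candidate Gal4 binding sites. It assumes the generally described Gal4 consensus of CGG, eleven bases of any identity, then CCG; this specification does not itself recite a UAS sequence, and the function name and the example promoter fragment are hypothetical.

```python
import re

# Assumed Gal4 consensus: CGG, then 11 bases of any identity, then CCG (17 bp).
# The lookahead allows overlapping candidate sites to be reported.
GAL4_SITE = re.compile(r"(?=(CGG[ACGT]{11}CCG))")


def find_gal4_sites(upstream_seq):
    """Return (0-based position, site) pairs for candidate Gal4 sites."""
    seq = upstream_seq.upper()
    return [(m.start(), m.group(1)) for m in GAL4_SITE.finditer(seq)]


# Hypothetical 5' upstream fragment carrying one 17-bp site.
promoter = "ttgacaCGGAGTACTGTCCTCCGagctatataagg"

sites = find_gal4_sites(promoter)
print(sites)
```

Because the reverse complement of CGG-N11-CCG has the same CGG-N11-CCG form, scanning one strand is sufficient for this particular pattern. A different DNA binding domain would need its own pattern.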

To identify effective chimeric AFT1 transcriptional activator proteins in vivo or in vitro, functional analyses are
performed. Such assays may be carried out using transiently transformed plant cells or transgenic plants harboring the
appropriate transgenes, e.g., an AFT1/Gal4 transcriptional activator and a storage protein promoter region containing
the requisite Gal4 DNA binding sequences, according to standard methods (see, e.g., Gelvin et al., supra).

To identify particularly useful combinations, i.e., chimeric AFT1 activators and their cognate genes, plasmids are
constructed and analyzed in either transient assays or in vivo in transgenic plants. Construction of chimeric transgenes is
by standard methods (see, e.g., Ausubel et al., supra). The wild-type promoter of a specific gene, e.g., the [...]
storage protein, containing the regulatory region [...]
gene, for example, the β-glucuronidase gene (GUS) (see, e.g., Jefferson, Plant Mol. Biol. Rep. 316: 387, 1987) in a
plant expression vector and introduced into a host by any established method (as described infra) along with the cognate
AFT1 chimeric transcriptional activator expression construct. By "reporter gene" is meant a gene whose expression may
be assayed; such genes include, without limitation, glucuronidase (GUS), luciferase, chloramphenicol transacetylase
(CAT), and β-galactosidase. In one particular example, the expression vector is transformed into Agrobacterium, followed
by transformation of the plant material, e.g., leaf discs (see, e.g., Gelvin et al., infra). Regenerated shoots are selected
on medium containing, e.g., kanamycin. After rooting, transgenic plantlets are transferred to soil and grown in a growth
[...]

Primary transformants are then assayed for chimeric AFT1-induced GUS activity either by quantitating GUS activity
or by histochemical staining as described below. Untransformed plants are taken as controls.

Fluorometric analysis of GUS activity can be performed in any plant cell protoplast or transgenic plant according to
standard methodologies. Alternatively, preparations of crude plant extracts can be assayed as described by
Jefferson (supra), using extracts standardized for protein concentration (see, e.g., Bradford, Anal. Biochem. 72: 248, 1976).
GUS levels in different plant tissues are assayed by enzymatic conversion of 4-methylumbelliferyl glucuronide to
4-methylumbelliferone, which is quantified with a fluorimeter (e.g., Perkin-Elmer LS 2B, Norwalk, CT). Typically, the
fluorimeter is set at 455 nm emission and 365 nm excitation wavelengths. GUS activity is generally expressed as picomoles
per milligram of protein per minute (see, e.g., Jefferson, supra).
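
The unit conversion described above (fluorescence to picomoles of 4-methylumbelliferone per milligram of protein per minute) can be sketched as follows. This is an illustration under stated assumptions, not the protocol of Jefferson: it assumes a linear standard curve expressed as fluorescence units per picomole of 4-methylumbelliferone, and the function name, parameter names, and all numerical readings are hypothetical.

```python
def gus_specific_activity(fluor_start, fluor_end, minutes, protein_mg, fluor_per_pmol):
    """Return GUS activity in pmol 4-methylumbelliferone per mg protein per minute.

    fluor_per_pmol is the slope of an assumed linear standard curve
    (fluorescence units per pmol of 4-methylumbelliferone).
    """
    if minutes <= 0 or protein_mg <= 0 or fluor_per_pmol <= 0:
        raise ValueError("minutes, protein_mg and fluor_per_pmol must be positive")
    pmol_product = (fluor_end - fluor_start) / fluor_per_pmol
    return pmol_product / protein_mg / minutes


# Hypothetical readings for one extract.
activity = gus_specific_activity(
    fluor_start=50.0,    # fluorescence at time zero
    fluor_end=650.0,     # fluorescence after incubation
    minutes=30.0,        # incubation time
    protein_mg=0.02,     # protein in the assayed aliquot
    fluor_per_pmol=2.0,  # standard-curve slope
)
print(f"{activity:.1f} pmol 4-MU per mg protein per min")
```

Subtracting the time-zero reading removes background fluorescence of the extract, and dividing by protein mass is what allows extracts from different tissues to be compared.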

Alternatively, GUS activity can be assayed by in situ histochemical staining, e.g., as follows. Whole tissues and thin
sections from transgenic plants and untransformed control plant tissue can be stained by incubation with
5-bromo-4-chloro-3-indolyl β-D-glucuronic acid (X-gluc; Research Organics, Inc., Cleveland, OH) as described by Jefferson et al.
(EMBO J. 6: 3901, 1987) and Gallagher (GUS Protocols, 1992). Tissue sections are incubated at 37°C in 2 mM X-gluc
in 0.1 M NaPO4 (pH 7.0), and then sectioned. GUS activity in a transformed plant is easily identified by the presence of
an indigo blue precipitate within the cells expressing the reporter gene. Stained material is optionally examined
microscopically using bright-field and dark-field optics.

AFT1 Interacting Proteins

Other properties of the AFT1 protein can be explored by modifying the interaction trap system described supra. For
example, proteins which interact with AFT1 can be isolated and identified. To this end, we used a LexA and partial AFT1
fusion protein as a bait (LexA/AFT1 33-194, i.e., AFT1 residues 33-194 fused to LexA) to search for proteins capable
of interacting with AFT1. We identified five novel cDNAs showing sequence homology to several plant genes, including
plant defense related gene products, e.g., 3-O-methyltransferase (see, e.g., Poeydomenge et al., Plant Physiol. 105:749-750,
1994 and Jaek et al., Mol. Plant-Microbe Interactions 5:294-300, 1992) and ascorbate peroxidase (see, e.g., Mittler
et al., Plant J. 5:397-405, 1994; Mehdy, Plant Physiol. 105:467-472, 1994), the proteasome gene product (see, e.g.,
Haffter et al., Nucleic Acids Res. 19:5075, 1991), and an ankyrin repeat protein gene product, AKR. The nucleotide
sequences for these cDNAs are shown in Figs. 6 (SEQ ID NO: 17), 8 (SEQ ID NO: 19), 10 (SEQ ID NO: 21), 12 (SEQ
ID NO: 23), and 14 (SEQ ID NO: 25). The deduced amino acid sequences coded for by these cDNAs are shown in Figs.
7 (SEQ ID NO: 18), 9 (SEQ ID NO: 20), 11 (SEQ ID NO: 22), 13 (SEQ ID NO: 24), and 15 (SEQ ID NO: 26).

AFT1 Polypeptide Expression

Polypeptides according to the invention may be produced by transformation of a suitable host cell with all or part of
an AFT1 cDNA (e.g., the cDNA described above) in a suitable expression vehicle or with a plasmid construct designed
to express the chimeric AFT1 transcriptional activator protein described supra.

Those skilled in the field of molecular biology will understand that any of a wide variety of expression systems may
be used to provide the recombinant protein. The precise host cell used is not critical to the invention. The AFT1 protein
or chimeric activator protein may be produced in a prokaryotic host, e.g., E. coli, or in a eukaryotic host, e.g.,
Saccharomyces cerevisiae, mammalian cells (e.g., COS 1 or NIH 3T3 cells), or any of a number of plant cells including, without
limitation, algae, tree species, ornamental species, temperate fruit species, tropical fruit species, vegetable species,
legume species, monocots, dicots, or in any plant of commercial or agricultural significance. Particular examples of
suitable plant hosts include Chlamydomonas, Conifers, Petunia, Tomato, Potato, Tobacco, Arabidopsis, Lettuce,
Sunflower, Oilseed rape, Flax, Cotton, Sugarbeet, Celery, Soybean, Alfalfa, Medicago, Lotus, Vigna, Cucumber, Carrot,
Eggplant, Cauliflower, Horseradish, Morning Glory, Poplar, Walnut, Apple, Asparagus, Rice, Corn, Millet, Onion, Barley,
Orchard grass, Oat, Rye, and Wheat.

Such cells are available from a wide range of sources including: the American Type Culture Collection (Rockville,
MD); Chlamydomonas Culture Collection (Duke University), Durham, North Carolina; or from any of a number of seed
companies, e.g., W. Atlee Burpee Seed Co. (Warminster, PA), Park Seed Co. (Greenwood, SC), Johnny Seed Co.
(Albion, ME), or Northrup King Seeds (Hartsville, SC). Descriptions and sources of useful host cells are also found in
Vasil, I.K., Cell Culture and Somatic Cell Genetics of Plants, Vol. I, II, III, Laboratory Procedures and Their Applications,
Academic Press, New York, 1984; Dixon, R.A., Plant Cell Culture-A Practical Approach, IRL Press, Oxford University,
1985; Green et al., Plant Tissue and Cell Culture, Academic Press, New York, 1987; Gasser and Fraley, Science
244:1293, 1989.

For prokaryotic expression, DNA encoding an AFT1 polypeptide of the invention is carried on a vector operably
linked to control signals capable of effecting expression in the prokaryotic host. If desired, the coding sequence may
contain, at its 5' end, a sequence encoding any of the known signal sequences capable of effecting secretion of the
expressed protein into the periplasmic space of the host cell, thereby facilitating recovery of the protein and subsequent
purification. Prokaryotes most frequently used are various strains of E. coli; however, other microbial strains may also
be used. Plasmid vectors are used which contain replication origins, selectable markers, and control sequences derived
from a species compatible with the microbial host. Examples of such vectors may be found in Pouwels et al. (supra) or
Ausubel et al. (supra). Commonly used prokaryotic control sequences (also referred to as "regulatory elements") are
defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding
site sequences. Promoters commonly used to direct protein expression include the beta-lactamase (penicillinase), the
lactose (lac) (Chang et al., Nature 198: 1056, 1977), the tryptophan (Trp) (Goeddel et al., Nucl. Acids Res. 8: 4057,
1980), and the tac promoter systems, as well as the lambda-derived PL promoter and N-gene ribosome binding site
(Simatake et al., Nature 292:128, 1981).

For eukaryotic expression, the method of transformation or transfection and the choice of vehicle for expression of
the AFT1 polypeptide or chimeric activator protein will depend on the host system selected. Transformation and
transfection methods are described, e.g., in Ausubel et al. (supra); Weissbach and Weissbach, Methods for Plant Molecular
Biology, Academic Press, 1989; Gelvin et al., Plant Molecular Biology Manual, Kluwer Academic Publishers, 1990;
Kindle, K., Proc. Natl. Acad. Sci., USA 87:1228, 1990; Potrykus, I., Annu. Rev. Plant Physiol. Plant Mol. Biology 42:205,
1991; and BioRad (Hercules, CA) Technical Bulletin #1687 (Biolistic Particle Delivery Systems). Expression vehicles
may be chosen from those provided, e.g., in Cloning Vectors: A Laboratory Manual (P.H. Pouwels et al., 1985, Supp.
1987); Gasser and Fraley (supra); Clontech Molecular Biology Catalog (Catalog 1992/93 Tools for the Molecular
Biologist, Palo Alto, CA); and the references cited above.

One preferred eukaryotic expression system is the mouse 3T3 fibroblast host cell transfected with a pMAMneo
expression vector (Clontech, Palo Alto, CA). pMAMneo provides: an RSV-LTR enhancer linked to a
dexamethasone-inducible MMTV-LTR promoter, an SV40 origin of replication which allows replication in mammalian systems, a selectable
neomycin gene, and SV40 splicing and polyadenylation sites. DNA encoding an AFT1 polypeptide would be inserted
into the pMAMneo vector in an orientation designed to allow expression. The recombinant AFT1 protein would be isolated
as described below. Other preferable host cells which may be used in conjunction with the pMAMneo expression vehicle
include COS cells and CHO cells (ATCC Accession Nos. CRL 1650 and CCL 61, respectively).

Alternatively, an AFT1 polypeptide is produced by a stably-transfected mammalian cell line. A number of vectors
suitable for stable transfection of mammalian cells are available to the public, e.g., see Pouwels et al. (supra); methods
for constructing such cell lines are also publicly available, e.g., in Ausubel et al. (supra). In one example, cDNA encoding
the AFT1 polypeptide is cloned into an expression vector which includes the dihydrofolate reductase (DHFR) gene.
Integration of the plasmid and, therefore, the AFT1-encoding gene into the host cell chromosome is selected for by
inclusion of 0.01-300 μM methotrexate in the cell culture medium (as described in Ausubel et al., supra). This dominant
selection can be accomplished in most cell types. Recombinant protein expression can be increased by DHFR-mediated
amplification of the transfected gene. Methods for selecting cell lines bearing gene amplifications are described in
Ausubel et al. (supra); such methods generally involve extended culture in medium containing gradually increasing levels
of methotrexate. DHFR-containing expression vectors commonly used for this purpose include pCVSEII-DHFR and
pAdD26SV(A) (described in Ausubel et al., supra). Any of the host cells described above or, preferably, a DHFR-deficient
CHO cell line (e.g., CHO DHFR- cells, ATCC Accession No. CRL 9096) are among the host cells preferred for DHFR
selection of a stably-transfected cell line or DHFR-mediated gene amplification.

Most preferably, an AFT1 polypeptide or AFT1 chimeric transcriptional activator is produced by a stably-transfected
plant cell line or by a transgenic plant. A number of vectors suitable for stable transfection of plant cells or for the
establishment of transgenic plants are available to the public; such vectors are described in Pouwels et al. (supra), Weissbach
and Weissbach (supra), and Gelvin et al. (supra). Methods for constructing such cell lines are described in, e.g.,
Weissbach and Weissbach (supra), and Gelvin et al. (supra). Typically, plant expression vectors include (1) a cloned
plant gene under the transcriptional control of 5' and 3' regulatory sequences and (2) a dominant selectable marker.
Such plant expression vectors may also contain, if desired, a promoter regulatory region (e.g., one conferring inducible
or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific expression), a transcription
initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a
polyadenylation signal.

Once the desired AFT1 nucleic acid sequence is obtained, it may be manipulated in a variety of ways known in the
art. For example, where the sequence involves non-coding flanking regions, the flanking regions may be subjected to
mutagenesis.

The AFT1 DNA sequence of the invention may, if desired, be combined with other DNA sequences in a variety of
ways. The AFT1 DNA sequence of the invention may be employed with all or part of the gene sequences normally
associated with the AFT1 protein. In its component parts, a DNA sequence encoding an AFT1 protein is combined in
the DNA construct having a transcription initiation control region capable of promoting transcription and translation in a
host cell.

In general, the constructs will involve regulatory regions functional in plants which provide for modified production 
of AFT1 protein or a chimeric AFT1 protein as discussed supra. The open reading frame coding for the AFT1 protein or 
functional fragment thereof will be joined at its 5' end to a transcription initiation regulatory region such as the sequence 
naturally found in the 5' upstream region of the AFT1 structural gene. Numerous other transcription initiation regions
are available which provide for constitutive or inducible regulation. 

For applications when developmental, hormonal or environmental expression is desired, appropriate 5' upstream
non-coding regions are obtained from other genes; for example, from genes regulated during seed development, embryo
development, or leaf development.

Regulatory transcript termination regions may also be provided in DNA constructs of this invention. Transcript
termination regions may be provided by the DNA sequence encoding the AFT1 protein or any convenient
transcription termination region derived from a different gene source, especially the transcript termination region which is
normally associated with the transcript initiation region. The transcript termination region will contain preferably at least
1 kb, preferably about 3 kb, of sequence 3' to the structural gene from which the termination region is derived. Plant
expression constructs having AFT1 as the DNA sequence of interest for expression thereof may be employed with a 
wide variety of plant life, particularly plant life involved in the production of seed storage proteins or storage lipids useful 
for industrial and agricultural applications. Importantly, this invention is applicable to dicotyledons and monocotyledons, 
and will be readily applicable to any new or improved transformation or regeneration method. 

An example of a useful plant promoter according to the invention is a caulimovirus promoter, e.g., a cauliflower
mosaic virus (CaMV) promoter. [...] confer high levels of expression in most plant tissues, and the activity
of these promoters is not dependent on virally encoded proteins. CaMV is a source for both the 35S and 19S promoters.
In most tissues of transgenic plants, the CaMV 35S promoter is a strong promoter (see, e.g., Odell et al., Nature 313:
810, 1985). The CaMV promoter is also highly active in monocots (see, e.g., Dekeyser et al., Plant Cell 2:591, 1990;
Terada and Shimamoto, Mol. Gen. Genet. 220:389, 1990). Moreover, activity of this promoter can be further increased
(i.e., between 2-10 fold) by duplication of the CaMV 35S promoter (see, e.g., Kay et al., Science 236:1299, 1987; Ow et
al., Proc. Natl. Acad. Sci., USA 84: 4870, 1987; and Fang et al., Plant Cell 1: 141, 1989).

Other useful plant promoters include, without limitation, the nopaline synthase promoter (An et al., Plant Physiol.
88: 547, 1988) and the octopine synthase promoter (Fromm et al., Plant Cell 1: 977, 1989).

For certain applications, it may be desirable to produce the AFT1 gene product in an appropriate tissue, at an
appropriate level, or at an appropriate developmental time. Thus, there are an assortment of gene promoters, each with
its own distinct characteristics embodied in its regulatory sequences, shown to be regulated in response to the
environment, hormones, and/or developmental cues. These include gene promoters that are responsible for (1) heat-regulated
gene expression (see, e.g., Callis et al., Plant Physiol. 88: 965, 1988), (2) light-regulated gene expression (e.g., the pea
rbcS-3A described by Kuhlemeier et al., Plant Cell 1: 471, 1989; the maize rbcS promoter described by Schaffner and
Sheen, Plant Cell 3: 997, 1991; or the chlorophyll a/b-binding protein gene found in pea described by Simpson et al.,
EMBO J. 4: 2723, 1985), (3) hormone-regulated gene expression (e.g., the abscisic acid responsive sequences from
the Em gene of wheat described by Marcotte et al., Plant Cell 1: 969, 1989), (4) wound-induced gene expression (e.g.,
of wun1 described by Siebertz et al., Plant Cell 1: 961, 1989), or (5) organ-specific gene expression (e.g., of the
tuber-specific storage protein gene described by Roshal et al., EMBO J. 6: 1155, 1987; the 23-kDa zein gene from maize
described by Schernthaner et al., EMBO J. 7: 1249, 1988; or the French bean β-phaseolin gene described by Bustos
[...]

Plant expression vectors may also optionally include RNA processing signals, e.g., introns, which have been shown
to be important for efficient RNA synthesis and accumulation (Callis et al., Genes and Dev. 1: 1183, 1987). The location
of the RNA splice sequences can dramatically influence the level of transgene expression in plants. In view of this fact,
an intron may be positioned upstream or downstream of an AFT1 polypeptide-encoding sequence in the transgene to
modulate levels of gene expression.

In addition to the aforementioned 5' regulatory control sequences, the expression vectors may also include regulatory
control regions which are generally present in the 3' regions of plant genes (Thornburg et al., Proc. Natl. Acad. Sci., USA
84: 744, 1987; An et al., Plant Cell 1: 115, 1989). For example, the 3' terminator region may be included in the expression
vector to increase stability of the mRNA. One such terminator region may be derived from the PI-II terminator region of
potato. In addition, other commonly used terminators are derived from the octopine or nopaline synthase signals.

The plant expression vector also typically contains a dominant selectable marker gene used to identify those cells 
that have become transformed. Useful selectable genes for plant systems include genes encoding antibiotic resistance 
genes, for example, those encoding resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin, or
spectinomycin. Genes required for photosynthesis may also be used as selectable markers in photosynthetic-deficient strains.
Finally, genes encoding herbicide resistance may be used as selectable markers; useful herbicide resistance genes
include the bar gene encoding the enzyme phosphinothricin acetyltransferase and conferring resistance to the broad
spectrum herbicide Basta® (Hoechst AG, Frankfurt, Germany).

Efficient use of selectable markers is facilitated by a determination of the susceptibility of a plant cell to a particular
selectable agent and a determination of the concentration of this agent which effectively kills most, if not all, of the
untransformed cells. Some useful concentrations of antibiotics for tobacco transformation include, e.g., 75-100 μg/ml
(kanamycin), 20-50 μg/ml (hygromycin), or 5-10 μg/ml (bleomycin). A useful strategy for selection of transformants for
herbicide resistance is described, e.g., by Vasil et al., supra.
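
Preparing selective medium at one of the concentrations quoted above is a simple dilution calculation. The following sketch takes the quoted ranges as micrograms per ml and computes the volume of antibiotic stock to add to a batch of medium. The function name, the stock concentration, and the medium volume are hypothetical, and the range check is a convenience for illustration rather than part of any cited protocol.

```python
# Working ranges for tobacco transformation as quoted in the text,
# read here as micrograms per ml.
TOBACCO_SELECTION_UG_PER_ML = {
    "kanamycin": (75, 100),
    "hygromycin": (20, 50),
    "bleomycin": (5, 10),
}


def stock_volume_ul(antibiotic, target_ug_per_ml, medium_ml, stock_mg_per_ml):
    """Return microliters of stock needed to reach the target concentration."""
    low, high = TOBACCO_SELECTION_UG_PER_ML[antibiotic]
    if not low <= target_ug_per_ml <= high:
        raise ValueError(f"{antibiotic} target outside quoted range {low}-{high} ug/ml")
    total_ug = target_ug_per_ml * medium_ml
    # 1 mg/ml equals 1 ug/ul, so dividing micrograms by mg/ml yields microliters.
    return total_ug / stock_mg_per_ml


# Hypothetical batch: 500 ml of medium, 50 mg/ml kanamycin stock.
volume = stock_volume_ul("kanamycin", target_ug_per_ml=100, medium_ml=500, stock_mg_per_ml=50)
print(f"add {volume:.0f} ul of stock")
```

The small volume that the stock contributes to the final medium volume is ignored in this sketch.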

It should be readily apparent to one skilled in the art of molecular biology, especially in the field of plant molecular 
biology, that the level of gene expression is dependent, not only on the combination of promoters, RNA processing
signals and terminator elements, but also on how these elements are used to increase the levels of selectable marker 
gene expression. 

Plant Transformation 

Upon construction of the plant expression vector, several standard methods are accessible for introduction of the
recombinant genetic material into the host plant for the generation of a transgenic plant. These methods include (1)
Agrobacterium-mediated transformation (A. tumefaciens or A. rhizogenes) (see, e.g., Lichtenstein and Fuller, In: Genetic
Engineering, vol. 6, PWJ Rigby, ed., London, Academic Press, 1987; and Lichtenstein, C.P., and Draper, J., In: DNA
Cloning, Vol. II, D.M. Glover, ed., Oxford, IRL Press, 1985), (2) the particle delivery system (see, e.g., Gordon-Kamm et
al., Plant Cell 2: 603, 1990; or BioRad Technical Bulletin 1687, supra), (3) microinjection protocols (see, e.g., Green et
al., supra), (4) polyethylene glycol (PEG) procedures (see, e.g., Draper et al., Plant Cell Physiol. 23:451, 1982; or, e.g.,
Zhang and Wu, Theor. Appl. Genet. 76:835, 1988), (5) liposome-mediated DNA uptake (see, e.g., Freeman et al., Plant
Cell Physiol. 25: 1353, 1984), (6) electroporation protocols (see, e.g., Gelvin et al., supra; Dekeyser et al., supra; or
Fromm et al., Nature 319: 791, 1986), and (7) the vortexing method (see, e.g., Kindle, supra). The method of
transformation is not critical to the instant invention; various methods of plant transformation are currently available (supra). As
newer methods are available to transform crops or other host cells they may be directly applied. Accordingly, a wide
variety of methods have been developed to insert a DNA sequence into the genome of a plant host to obtain the transcription
or transcription and translation of the sequence to effect phenotypic changes in both dicots and monocots. Moreover, the
manner in which the DNA construct is introduced into the plant host is not critical to the invention. Thus, any method
which provides for efficient transformation may be employed.

The following is an example outlining an Agrobacterium-mediated plant transformation. The general process for 
manipulating genes to be transferred into the genome of plant cells is carried out in two phases. First, all the cloning 
and DNA modification steps are done in E. coli, and the plasmid containing the gene construct of interest is transferred 
by conjugation into Agrobacterium. Second, the resulting Agrobacterium strain is used to transform plant cells. Thus, 
for the generalized plant expression vector, the plasmid contains an origin of replication that allows it to replicate in 
Agrobacterium and a high copy number origin of replication functional in E. coli. This permits facile production and testing 
of transgenes in E. coli prior to transfer to Agrobacterium for subsequent introduction into plants. Resistance genes can 
be carried on the vector, one for selection in bacteria, e.g., streptomycin, and the other that will express in plants, e.g.,
a gene encoding for kanamycin resistance or an herbicide resistance gene. Also present are restriction endonuclease
sites for the addition of one or more transgenes operably linked to appropriate regulatory sequences and directional
T-DNA border sequences which, when recognized by the transfer functions of Agrobacterium, delimit the region that will
be transferred to the plant.
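
The elements of the generalized plant expression vector listed above lend themselves to a simple completeness check when a construct is being planned. The sketch below is illustrative only. The feature names are hypothetical labels chosen for this example, and splitting the T-DNA borders into separate left and right entries is an assumption based on the text's reference to directional border sequences.

```python
# Hypothetical labels for the vector elements named in the text.
REQUIRED_FEATURES = {
    "ecoli_origin",                # high copy number origin functional in E. coli
    "agrobacterium_origin",        # origin allowing replication in Agrobacterium
    "bacterial_selection_marker",  # e.g., streptomycin resistance
    "plant_selection_marker",      # e.g., kanamycin or herbicide resistance
    "cloning_sites",               # restriction sites for adding transgenes
    "t_dna_left_border",
    "t_dna_right_border",
}


def missing_features(vector_features):
    """Return the required elements absent from a planned vector, sorted by name."""
    return sorted(REQUIRED_FEATURES - set(vector_features))


# Hypothetical draft design that lacks a plant marker and one border.
draft = [
    "ecoli_origin",
    "agrobacterium_origin",
    "bacterial_selection_marker",
    "cloning_sites",
    "t_dna_right_border",
]

gaps = missing_features(draft)
print(gaps)
```

A check of this kind only confirms that each element is present; it says nothing about orientation or about whether the transgene lies between the borders.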

In another example, plant cells may be transformed by shooting into the cell tungsten microprojectiles on which
cloned DNA is precipitated. In the Biolistic Apparatus (Bio-Rad, Hercules, CA) used for the shooting, a gunpowder charge
(22 caliber Power Piston Tool Charge) or an air-driven blast drives a plastic macroprojectile through a gun barrel. An
aliquot of a suspension of tungsten particles on which DNA has been precipitated is placed on the front of the plastic
macroprojectile. The latter is fired at an acrylic stopping plate that has a hole through it that is too small for the
macroprojectile to go through. As a result, the plastic macroprojectile smashes against the stopping plate and the tungsten
microprojectiles continue toward their target through the hole in the plate. For the instant invention the target can be any
plant cell, tissue, seed, or embryo. The DNA introduced into the cell on the microprojectiles becomes integrated into
either the nucleus or the chloroplast.

Transfer and expression of transgenes in plant cells is now routine practice to those skilled in the art. It has become 
a major tool to carry out gene expression studies and to attempt to obtain improved plant varieties of agricultural or 
commercial interest.

Transgenic Plant Regeneration 

Plant cells transformed with a plant expression vector can be regenerated, e.g., from single cells, callus tissue or
leaf discs according to standard plant tissue culture techniques. It is well known in the art that various cells, tissues, and
organs from almost any plant can be successfully cultured to regenerate an entire plant; such techniques are described,
e.g., in Vasil, supra; Green et al., supra; Weissbach and Weissbach, supra; and Gelvin et al., supra.

In one particular example, a cloned AFT1 polypeptide under the control of the 35S CaMV promoter and the nopaline
synthase terminator and carrying a selectable marker (e.g., kanamycin resistance) is transformed into Agrobacterium.
Transformation of leaf discs (e.g., of tobacco leaf discs) with vector-containing Agrobacterium is carried out as described
by Horsch et al. (Science 227: 1229, 1985). Putative transformants are selected after a few weeks (e.g., 3 to 5 weeks)
on plant tissue culture media containing kanamycin (e.g., 100 μg/ml). Kanamycin-resistant shoots are then placed on
plant tissue culture media without hormones for root initiation. Kanamycin-resistant plants are then selected for
greenhouse growth. If desired, seeds from self-fertilized transgenic plants can then be sowed in a soil-less media and grown
in a greenhouse. Kanamycin-resistant progeny are selected by sowing surface-sterilized seeds on hormone-free
kanamycin-containing media. Analysis for the integration of the transgene is accomplished by standard techniques (see,
e.g., Ausubel et al., supra; Gelvin et al., supra).

Transgenic plants expressing the selectable marker are then screened for transmission of the transgene DNA by 
standard immunoblot and DNA detection techniques. Each positive transgenic plant and its transgenic progeny are 
unique in comparison to other transgenic plants established with the same transgene. Integration of the transgene DNA 
into the plant genomic DNA is in most cases random and the site of integration can profoundly affect the levels, and the
tissue and developmental patterns of transgene expression. Consequently, a number of transgenic lines are usually 
screened for each transgene to identify and select plants with the most appropriate expression profiles. 

Transgenic lines are evaluated on levels of transgene expression. Expression at the RNA level is determined initially
to identify and quantitate expression-positive plants. Standard techniques for RNA analysis are employed and include
PCR amplification assays using oligonucleotide primers designed to amplify only transgene RNA templates and solution
hybridization assays using transgene-specific probes (see, e.g., Ausubel et al., supra). The RNA-positive plants are then
analyzed for protein expression by Western immunoblot analysis using AFT1-specific antibodies (see, e.g., Ausubel et
al., supra). In addition, in situ hybridization and immunocytochemistry according to standard protocols can be done using
transgene-specific nucleotide probes and antibodies, respectively, to localize sites of expression within transgenic tissue.

Once the recombinant AFT1 protein is expressed in any cell or in a transgenic plant (e.g., as described above), it
may be isolated, e.g., using affinity chromatography. In one example, an anti-AFT1 antibody (e.g., produced as described
in Ausubel et al., supra, or by any standard technique) may be attached to a column and used to isolate the polypeptide.
Lysis and fractionation of AFT1-producing cells prior to affinity chromatography may be performed by standard methods
(see, e.g., Ausubel et al., supra). Once isolated, the recombinant protein can, if desired, be further purified, e.g., by high
performance liquid chromatography (see, e.g., Fisher, Laboratory Techniques In Biochemistry And Molecular Biology,
eds., Work and Burdon, Elsevier, 1980).

These general techniques of polypeptide expression and purification can also be used to produce and isolate useful
AFT1 fragments or analogs.

In other applications, however, expression of the transgene in the plant cell or the transgenic plant may be the desired
result. These include applications such as AFT1-controlled regulation of plant defense-related proteins, e.g.,
3-O-methyltransferase or ascorbate peroxidase, or altering the normal development of the plant.

Use 

Introduction of AFT1 or a chimeric AFT1 transcriptional activator into a transformed plant cell facilitates the manipulation
of developmental events. For example, transgenic plants of the instant invention expressing AFT1 or an AFT1
chimeric transcriptional activator might be used to alter, simply and inexpensively, or regulate plant gene expression,
e.g., plant defense mechanisms, expression of plant storage components, or any number of other plant developmental
events.

Other Embodiments

The invention also includes any biologically active fragment or analog of a crucifer AFT1 protein. By "biologically
active" is meant possessing any in vivo or in vitro activity which is characteristic of the crucifer AFT1 polypeptide shown
in Fig. 1 (SEQ ID NO:2). Because crucifer AFT1 protein exhibits a range of physiological properties and because such
properties may be attributable to different portions of the crucifer AFT1 protein molecule, a useful AFT1 fragment or
analog is one which exhibits a biological activity in any biological assay for AFT1 transcriptional activation or binding
activity, for example, those assays described supra. Such fragment or analog may function in accordance with developmental
stages of different cell types and in response to different environmental factors and hormonal cues, or in response
to a particular signal transduction pathway.

Preferred analogs include AFT1 proteins (or biologically active fragments thereof) whose sequences differ from the
wild-type sequence only by conservative amino acid substitutions, for example, substitution of one amino acid for another
with similar characteristics (e.g., valine for glycine, arginine for lysine, etc.) or by one or more non-conservative amino
acid substitutions, deletions, or insertions which do not abolish the polypeptide's biological activity.

Analogs can differ from naturally occurring AFT1 protein in amino acid sequence or can be modified in ways that
do not involve sequence, or both. Analogs of the invention will generally exhibit at least 70%, preferably 80%, more
preferably 90%, and most preferably 95% or even 99%, homology with a segment of 20 amino acid residues, preferably
40 amino acid residues, or more preferably the entire sequence of a naturally occurring AFT1 polypeptide sequence.

Alterations in primary sequence include genetic variants, both natural and induced. Also included are analogs that
include residues other than naturally occurring L-amino acids, e.g., D-amino acids or non-naturally occurring or synthetic
amino acids, e.g., β or γ amino acids. Alternatively, increased stability may be conferred by cyclizing the peptide molecule.
Also included in the invention are crucifer AFT1 proteins modified by in vivo or in vitro chemical derivatization of polypeptides
including acetylation, methylation, phosphorylation, carboxylation, or glycosylation.
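
By way of illustration only, the homology thresholds recited above can be checked computationally. The following minimal Python sketch assumes that "homology" is scored as simple percent identity over an ungapped window, which is one possible reading and not a definition given in this specification. The function names and the analog sequence (carrying two substitutions, Arg7 to Lys and Val11 to Ile) are hypothetical and serve only as an example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions carrying the same residue in two equal-length segments."""
    if len(a) != len(b) or not a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(reference: str, candidate: str, window: int = 20) -> float:
    """Highest ungapped identity between any window of candidate and any window of reference."""
    best = 0.0
    for i in range(len(candidate) - window + 1):
        segment = candidate[i:i + window]
        for j in range(len(reference) - window + 1):
            best = max(best, percent_identity(segment, reference[j:j + window]))
    return best


# First 40 residues of AFT1 (SEQ ID NO: 2), one-letter code.
AFT1_N40 = "MAATLGRDQYVYMAKLAEQAERYEEMVQFMEQLVTGATPA"
# Hypothetical analog (assumption for illustration): R7K and V11I substitutions.
ANALOG = "MAATLGKDQYIYMAKLAEQAERYEEMVQFMEQLVTGATPA"

if __name__ == "__main__":
    # Identity over the first 20-residue segment (18 of 20 positions match).
    print(percent_identity(AFT1_N40[:20], ANALOG[:20]))
    # Best identity over any 20-residue window of the analog.
    print(best_window_identity(AFT1_N40, ANALOG, window=20))
```

Under this reading, the hypothetical analog would satisfy the 90% threshold over its first 20 residues and any of the recited thresholds over a 20-residue window that avoids both substitutions.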

In addition to substantially full-length polypeptides, the invention also includes biologically active fragments of the
polypeptides. As used herein, the term "fragment", as applied to a polypeptide, will ordinarily be at least 20 residues,
more typically at least 40 residues, and preferably at least 60 residues in length. Fragments of crucifer AFT1 proteins
can be generated by methods known to those skilled in the art. The ability of a candidate fragment to exhibit a biological
activity of crucifer AFT1 protein can be assessed by those methods described herein. Also included in the invention are
crucifer AFT1 proteins containing residues that are not required for biological activity of the peptide, e.g., those added
by alternative mRNA splicing or alternative protein processing events.

(1) GENERAL INFORMATION:

(i) APPLICANT: Zhang et al.

(ii) TITLE OF INVENTION: CRUCIFER AFT PROTEINS AND USES THEREOF

(iii) NUMBER OF SEQUENCES:

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE:
(B) STREET:
(C) CITY:
(D) STATE:
(E) COUNTRY: U.S.A.
(F) ZIP:

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 3.5" Diskette, 1.44 Mb
(B) COMPUTER: IBM PS/2 Model 50Z or 55SX
(C) OPERATING SYSTEM: MS-DOS (Version 5.0)
(D) SOFTWARE: WordPerfect (Version 5.1)

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Lech, Karen F.
(B) REGISTRATION NUMBER: 35,238
(C) REFERENCE/DOCKET NUMBER: 00786/219001

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (617) 542-5070
(B) TELEFAX: (617) 542-8906
(C) TELEX: 200154


(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 845
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AAAAAAAAAT CAAATCTCTC TCTTTCTCTC TCTAATGGCG GCGACATTAG GCAGAGACCA 60
GTATGTGTAC ATGGCGAAGC TCGCCGAGCA GGCGGAGCGT TACGAAGAGA TGGTTCAATT 120
CATGGAACAG CTCGTTACAG GCGCTACTCC AGCGGAAGAG CTCACCGTTG AAGAGAGGAA 180
TCTCCTCTCT GTTGCTTACA AGAACGTGAT CGGATCTCTA CGCGCCGCCT GGAGGATCGT 240
GTCTTCGATT GAGCAGAAGG AAGAGAGTAG GAAGAACGAC GAGCACGTGT CGCTTGTCAA 300
GGATTACAGA TCTAAAGTTG AGTCTGAGCT TTCTTCTGTT TGCTCTGGAA TCCTTAAGCT 360
CCTTGACTCG CATCTGATCC CATCTGCTGG AGCGAGTGAG TCTAAGGTCT TTTACTTGAA 420
GATGAAAGGT GATTATCATC GGTACATGGC TGAGTTTAAG TCTGGTGATG AGAGGAAAAC 480
TGCTGCTGAA GATACCATGC TCGCTTACAA AGCAGCTCAG GATATCGCAG CTGCGGATAT 540
GGCACCTACT CATCCGATAA GGCTTGGTCT GGCCCTGAAT TTCTCAGTGT TCTACTATGA 600
GATTCTCAAT TCTTCAGACA AAGCTTGTAA CATGGCCAAA CAGGCTTTTG AGGAGGCCAT 660
AGCTGAGCTT GACACTCTGG GAGAGGAATC CTACAAAGAC AGCACTCTCA TAATGCAGTT 720
GCTGAGGGAC AATTTAACCC TTTGGACCTC CGATATGCAG GAGCAGATGG ACGAGGCCTG 780
AGGATCTAGA TGAAGGGGGG GAGGGTTGTT ACGCGATGTT TCTGCCACCA AATCGATCTC 840
AAAAT 845

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 248
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Ala Thr Leu Gly Arg Asp Gln Tyr Val Tyr Met Ala Lys Leu
1 5 10 15

Ala Glu Gln Ala Glu Arg Tyr Glu Glu Met Val Gln Phe Met Glu Gln
20 25 30

Leu Val Thr Gly Ala Thr Pro Ala Glu Glu Leu Thr Val Glu Glu Arg
35 40 45

Asn Leu Leu Ser Val Ala Tyr Lys Asn Val Ile Gly Ser Leu Arg Ala
50 55 60

Ala Trp Arg Ile Val Ser Ser Ile Glu Gln Lys Glu Glu Ser Arg Lys
65 70 75 80

Asn Asp Glu His Val Ser Leu Val Lys Asp Tyr Arg Ser Lys Val Glu
85 90 95

Ser Glu Leu Ser Ser Val Cys Ser Gly Ile Leu Lys Leu Leu Asp Ser
100 105 110

His Leu Ile Pro Ser Ala Gly Ala Ser Glu Ser Lys Val Phe Tyr Leu
115 120 125

Lys Met Lys Gly Asp Tyr His Arg Tyr Met Ala Glu Phe Lys Ser Gly
130 135 140

Asp Glu Arg Lys Thr Ala Ala Glu Asp Thr Met Leu Ala Tyr Lys Ala
145 150 155 160

Ala Gln Asp Ile Ala Ala Ala Asp Met Ala Pro Thr His Pro Ile Arg
165 170 175

Leu Gly Leu Ala Leu Asn Phe Ser Val Phe Tyr Tyr Glu Ile Leu Asn
180 185 190

Ser Ser Asp Lys Ala Cys Asn Met Ala Lys Gln Ala Phe Glu Glu Ala
195 200 205

Ile Ala Glu Leu Asp Thr Leu Gly Glu Glu Ser Tyr Lys Asp Ser Thr
210 215 220

Leu Ile Met Gln Leu Leu Arg Asp Asn Leu Thr Leu Trp Thr Ser Asp
225 230 235 240

Met Gln Glu Gln Met Asp Glu Ala 248
245

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GCGGAATTCA TGAGGCCCAT TAAAATT 27

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTAGGATCCG GTCGGATTTC TTGTCGC 27

EP 0 693 554 A1






(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CGCGAATTCA ATAGCGACAA GTACGAT 27

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTAGGATCCG TCTCTCTTCC AAGGTAGA 28

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GATCCTAGAA TTCAAGAAGA ATCGGCGTGG C 31

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

CTGACTGAAT TCATGGCGGC GACATTAGG 29

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GACTGAGTCG ACCCTTCATC TAGATCCTC 29



(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GACTGACTCG AGCCTTCATC TAGATCCTCA 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CTGACTGAAT TCGAGTCTAA GGTCTTTAC 29

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GACTGACTCG AGACTCGCTC CAGCAGATGG 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GACTGACTCG AGTGAAGAAT TGAGAATCTC 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GACTGAGTCG ACACTCGCTC CAGCAGATGG 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GACTGAGTCG ACTGAAGAAT TGAGAATCTC 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CTGACTGAAT TCGTTACAGG CGCTACTCCA G 31
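
For illustration only, several of the oligonucleotides listed above contain hexamers that match well-known restriction enzyme recognition sequences (GAATTC for EcoRI, GGATCC for BamHI, GTCGAC for SalI, CTCGAG for XhoI). The following Python sketch simply locates such hexamers. Whether these sites were in fact used for cloning is an assumption not stated in this listing, and the function name is hypothetical.

```python
# Recognition sequences of four common restriction enzymes.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "SalI": "GTCGAC", "XhoI": "CTCGAG"}

# Oligonucleotides copied from the sequence listing (spaces removed).
PRIMERS = {
    "SEQ ID NO: 3": "GCGGAATTCATGAGGCCCATTAAAATT",
    "SEQ ID NO: 4": "GTAGGATCCGGTCGGATTTCTTGTCGC",
    "SEQ ID NO: 8": "CTGACTGAATTCATGGCGGCGACATTAGG",
    "SEQ ID NO: 9": "GACTGAGTCGACCCTTCATCTAGATCCTC",
    "SEQ ID NO: 10": "GACTGACTCGAGCCTTCATCTAGATCCTCA",
}

# Nucleotides 1-60 of SEQ ID NO: 1.
SEQ1_NT_1_60 = "AAAAAAAAATCAAATCTCTCTCTTTCTCTCTCTAATGGCGGCGACATTAGGCAGAGACCA"


def find_sites(primer):
    """Return {enzyme name: 0-based offset} for each recognition hexamer found in the primer."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
    # The portion of SEQ ID NO: 8 after the first 12 nucleotides matches
    # SEQ ID NO: 1 beginning at the ATG (0-based offset within nucleotides 1-60).
    print(SEQ1_NT_1_60.find(PRIMERS["SEQ ID NO: 8"][12:]))
```

The last line shows that the 3' portion of SEQ ID NO: 8 is identical to SEQ ID NO: 1 starting at the ATG, which is consistent with, but does not by itself establish, its use as a forward primer for the AFT1 coding region.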

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 17: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 567
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TCACCCAGAG AGGTCAGGCT TTGATGGACC ATGGACCCAA GAGCCGCTGA AGTTTGACAA 60
CTCCTACTTC GTGGAACTGC TGAAAGGAGA ATCAGAGGGC TTGTTGAAAC TTCCAACTGA 120
CAAGACCTTA TTGGAAGACC CGGAGTTCCG TCGTCTTGTT GAGCTTTATG CAAAGGATGA 180
AGATGCATTC TTCAGAGACT ACGCGGAATC GCACAAGAAA CTCTCTGAGC TTGGTTTCAA 240
CCCAAACTCC TCAGCAGGCA AAGCAGTTGC AGACAGCACG ATTCTGGCAC AGAGTGCGTT 300
CGGGGTTGCA GTTGCTGCTG CGGTTGTGGC ATTTGGTTAC TTTTACGAGA TTCGGAAGAG 360
GATGAAGTAA ACGAAATAGG AAGGAAAACA CGAAGCAACG ATGCTCTTAT TTGGGTATTA 420
AAGAAACTAT TAATCGTCTA TCGAATCTAT TTTGCTGCTA CAAGATTCTA AACTCTTTGA 480
ATCCACGATT CCACTGTTTA GTAGTAAAAA AGTTAAAAAG TCAATATTTT GGGTCCGTGA 540
TTCATTTTTG CGATAAA 557

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 18: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 122
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

His Pro Glu Arg Ser Gly Phe Asp Gly Pro Trp Thr Gln Glu Pro Leu
1 5 10 15

Lys Phe Asp Asn Ser Tyr Phe Val Glu Leu Leu Lys Gly Glu Ser Glu
20 25 30

Gly Leu Leu Lys Leu Pro Thr Asp Lys Thr Leu Leu Glu Asp Pro Glu
35 40 45

Phe Arg Arg Leu Val Glu Leu Tyr Ala Lys Asp Glu Asp Ala Phe Phe
50 55 60

Arg Asp Tyr Ala Glu Ser His Lys Lys Leu Ser Glu Leu Gly Phe Asn
65 70 75 80

Pro Asn Ser Ser Ala Gly Lys Ala Val Ala Asp Ser Thr Ile Leu Ala
85 90 95

Gln Ser Ala Phe Gly Val Ala Val Ala Ala Ala Val Val Ala Phe Gly
100 105 110

Tyr Phe Tyr Glu Ile Arg Lys Arg Met Lys 122
115 120

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 478
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GAGTGACGAA CATTGCGTGA AATTCTTGAA GAACTGCTAC GAGTCACTTC CAGAGGATGG 60
AAAAGTGATA TTAGCAGAGT GTATTCTTCC AGAGACACCA GACTCAAGCC TCTCAACCAA 120
ACAAGTAGTC CATGTCGATT GCATTATGTT GGCTCACAAT CCCGGAGGCA AAGAACGAAC 180
CGAGAAAGAG TTTGAGGCAT TAGCCAAAGC ATCAGGCTTC AAGGGCATCA AAGTTGTCTG 240
CGACGCTTTT GGTGTTAACC TTATTGAGTT ACTCAAGAAG CTCTAAAAAC AAACAATGTT 300
CCTATGAAGA TGATTTATAT GTAAACATTA TCTCATATCT CCTTCCACGG TTCCAAAACT 360
ATGCTGTTTA ATAATGGTTT TTACAAGAAT TTGATTATGA GTTTGTATTT TTGTTTGTTT 420
GGAACAAAAT TATGTGATTA TAGGGAAAAA TAAAATGAGC TATTATTGAA GAAAAAAA 478

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 94
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Ser Asp Glu His Cys Val Lys Phe Leu Lys Asn Cys Tyr Glu Ser Leu
1 5 10 15

Pro Glu Asp Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Glu Thr
20 25 30

Pro Asp Ser Ser Leu Ser Thr Lys Gln Val Val His Val Asp Cys Ile
35 40 45

Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu Lys Glu Phe
50 55 60

Glu Ala Leu Ala Lys Ala Ser Gly Phe Lys Gly Ile Lys Val Val Cys
65 70 75 80

Asp Ala Phe Gly Val Asn Leu Ile Glu Leu Leu Lys Lys Leu
85 90

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1357
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CCAGATTATC CCTCCCCCGA ATTCGGCACG AGGAAAAATC CTCTTCTTTC AGATGAGAAA 60 

CCCAAATCGA CGGAGGAGAA TAAGAGTTCT AAGCCGGAAT CAGCTTCTGG GAGTTCAACT 120

TCATCAGCTA TGCCTGGCTT GAATTTCAAT GCTTTTGATT TCTCTAATAT GGCTAGTATT 180

CTCAACGATC CTAGCATCAG AGAAATGGCT GAGCAAATAG CTAAAGATCC TGCCTTTAAC 240 

CAATTGGCTG AGCAGCTTCA GAGATCTATT CCTAACGCTG GCCAGGAAGG TGGTTTCCCT 300 

AACTTTGATC CTCAACAGTA TGTCAATACA ATGCAACAGG TTATGCATAA CCCTGAGTTT 360

AAGACAATGG CCGAGAAACT TGGTACCGCC TTAGTTCAGG ATCCACAAAT GTCTCCTTTT 420 

TTGGATGCTT TCTCGAATCC TGAAACAGCA GAACACTTTA CTGAGCGTAT GGCGCGGATG 480 

AAAGAAGATC CAGAGTTGAA ACCTATACTA GATGAGATTG ATGCTGGTGG TCCTTCTGCC 540 

ATGATGAAGT ACTGGAATGA TCCAGAAGTG CTGAAAAAGC TGGGTGAAGC AATGGGTATG 600 

CCTGTTGCTG GCTTACCAGA CCAGACTGTT TCAGCTGAAC CTGAGGTAGC AGAAGAAGGT 660 

GAAGAAGAAG AGTCTATTGT TCACCAAACT GCCAGTCTTG GTGATGTTGA GGGTTTGAAA 720 

GCTGCCTTGG CATCTGGTGG TAACAAAGAT GAAGAAGATT CTGAAGGAAG GACAGCATTG 780 

CATTTTGCTT GTGGATACGG CGAGTTGAAA TGTGCTCAAG TTCTTATCGA TGCTGGAGCA 840 

AGTGTTAATG CGGTTGACAA AAACAAGAAC ACACCTCTGC ATTATGCTGC TGGTTACGGG 900 

AGGAAAGAGA GTGTAAGCCT TCTCCTGGAG AATGGTGCTG CAGTCACTCT GCAAAACCTA 960 

GACGAGAAGA CGCCAATTGA TGTAGCGAAG CTCAACAGCC AGCTGGAGGT GGTGAAGCTG 1020 

CTTGAGAAGG ATGCTTTCCT TTGAGCTCTG CTGGTTAAAG GAAAGCTCTA AGCTCATATT 1080 

GTCTTTGAGG CATTTGTCTT GTGTGTGTCC TGAACCAGTT TCACAGGCTT TTTGTGTACA 1140 

CTTTTTATTA GTTCCTCTCT TCTTCTAAAT TTGTCTCTTA TGTTGTTTTA AAAGTCAATA 1200 

AAGAAAGAAA TAGCAATCAA TGATTTAATT TATGATTATA TTCTTTATTT CGTCGACCTC 1260 

TACAGAATGA TTCAATTTGG AAGAATCATT CTGGTTTGGA GGATATGTAA GAAAAACTAC 1320

TTGATCTCCA AGTTATTCCA TTCTTCTGTT GAAAAAA 1357 



(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 339
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Gly Thr Arg Lys Asn Pro Leu Leu Ser Asp Glu Lys Pro Lys Ser Thr
1 5 10 15

Glu Glu Asn Lys Ser Ser Lys Pro Glu Ser Ala Ser Gly Ser Ser Thr
20 25 30

Ser Ser Ala Met Pro Gly Leu Asn Phe Asn Ala Phe Asp Phe Ser Asn
35 40 45

Met Ala Ser Ile Leu Asn Asp Pro Ser Ile Arg Glu Met Ala Glu Gln
50 55 60

Ile Ala Lys Asp Pro Ala Phe Asn Gln Leu Ala Glu Gln Leu Gln Arg
65 70 75 80

Ser Ile Pro Asn Ala Gly Gln Glu Gly Gly Phe Pro Asn Phe Asp Pro
85 90 95

Gln Gln Tyr Val Asn Thr Met Gln Gln Val Met His Asn Pro Glu Phe
100 105 110

Lys Thr Met Ala Glu Lys Leu Gly Thr Ala Leu Val Gln Asp Pro Gln
115 120 125

Met Ser Pro Phe Leu Asp Ala Phe Ser Asn Pro Glu Thr Ala Glu His
130 135 140

Phe Thr Glu Arg Met Ala Arg Met Lys Glu Asp Pro Glu Leu Lys Pro
145 150 155 160

Ile Leu Asp Glu Ile Asp Ala Gly Gly Pro Ser Ala Met Met Lys Tyr
165 170 175

Trp Asn Asp Pro Glu Val Leu Lys Lys Leu Gly Glu Ala Met Gly Met
180 185 190

Pro Val Ala Gly Leu Pro Asp Gln Thr Val Ser Ala Glu Pro Glu Val
195 200 205

Ala Glu Glu Gly Glu Glu Glu Glu Ser Ile Val His Gln Thr Ala Ser
210 215 220

Leu Gly Asp Val Glu Gly Leu Lys Ala Ala Leu Ala Ser Gly Gly Asn
225 230 235 240

Lys Asp Glu Glu Asp Ser Glu Gly Arg Thr Ala Leu His Phe Ala Cys
245 250 255

Gly Tyr Gly Glu Leu Lys Cys Ala Gln Val Leu Ile Asp Ala Gly Ala
260 265 270

Ser Val Asn Ala Val Asp Lys Asn Lys Asn Thr Pro Leu His Tyr Ala
275 280 285

Ala Gly Tyr Gly Arg Lys Glu Ser Val Ser Leu Leu Leu Glu Asn Gly
290 295 300

Ala Ala Val Thr Leu Gln Asn Leu Asp Glu Lys Thr Pro Ile Asp Val
305 310 315 320

Ala Lys Leu Asn Ser Gln Leu Glu Val Val Lys Leu Leu Glu Lys Asp
325 330 335

Ala Phe Leu 339

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 663
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TTTTAAAAAA TTTTGCCATC AACCGTAGAT GTTCCGCCAA AGGGTGGGTT TAGCTTCGAT 60
CTGTGTAAGA GAAATGATAT TCTTACACAA AAGGGTCTTA AAGCTCCGTC TTTTTTGAAG 120
ACTGGAACAA CCATTGTTGG TTTGATTTTC AAGGATGGTG TGATACAAGG GGCAGATACC 180
CGAGCAACTG AGGGGCCAAT TGTTGCTGAT AAGAACTGTG AGAAGATTCA CTATATGGCA 240
CCAAACATAT ATTGCTGTGG TGCAGGAACT CGGGCTGATA CTGAAGCAGT CACTGATATG 300
GTCAGCTCAC AGCTGCGATT GCATCGTTAC CAGACTGGTC GAGACTCTCG GGTCATTACT 360
GCTTTGACCC TTCTCAAAAA ACATTTTTTC AGCTACCAAG GTCATGTCTC TGCTGCTCTT 420
GTACTCGGTG GAGTTGATAT CACTGGTCCA CATCTGCATA CTATATACCC ACACGGTTCA 480
ACTGACACTC TTCCATTCGC CACAATGGGT TCGGGTTCTC TTGCTGCTAT GTCTGTGTTT 540
GAGGCAAAGT ATAAAGAAGG CCTAACTAGG GATGAAGGAA TTAAGCTGGT CGCTGAATCC 600
ATATGCTCGG GTATATCCAA TGACCTGGGT AGTGGTAGCA ACGTGGACAT CTGCGTGATC 660
ACA 663

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 219
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Lys Ile Leu Pro Ser Thr Val Asp Val Pro Pro Lys Gly Gly Phe Ser
1 5 10 15

Phe Asp Leu Cys Lys Arg Asn Asp Ile Leu Thr Gln Lys Gly Leu Lys
20 25 30

Ala Pro Ser Phe Leu Lys Thr Gly Thr Thr Ile Val Gly Leu Ile Phe
35 40 45

Lys Asp Gly Val Ile Gln Gly Ala Asp Thr Arg Ala Thr Glu Gly Pro
50 55 60

Ile Val Ala Asp Lys Asn Cys Glu Lys Ile His Tyr Met Ala Pro Asn
65 70 75 80

Ile Tyr Cys Cys Gly Ala Gly Thr Arg Ala Asp Thr Glu Ala Val Thr
85 90 95

Asp Met Val Ser Ser Gln Leu Arg Leu His Arg Tyr Gln Thr Gly Arg
100 105 110

Asp Ser Arg Val Ile Thr Ala Leu Thr Leu Leu Lys Lys His Phe Phe
115 120 125

Ser Tyr Gln Gly His Val Ser Ala Ala Leu Val Leu Gly Gly Val Asp
130 135 140

Ile Thr Gly Pro His Leu His Thr Ile Tyr Pro His Gly Ser Thr Asp
145 150 155 160

Thr Leu Pro Phe Ala Thr Met Gly Ser Gly Ser Leu Ala Ala Met Ser
165 170 175

Val Phe Glu Ala Lys Tyr Lys Glu Gly Leu Thr Arg Asp Glu Gly Ile
180 185 190

Lys Leu Val Ala Glu Ser Ile Cys Ser Gly Ile Ser Asn Asp Leu Gly
195 200 205

Ser Gly Ser Asn Val Asp Ile Cys Val Ile Thr 219
210 215

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 976
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

ACGAGAGGCC CTGAGACGCG GCAGATATCA GGTCCTGCGA CTTCAACACA GATCAGGAAC 60
TTCACATTAT CTCAGCATCT GCAAGGAATC CACACACATA TCTCATCCAT GCTAGCGGAC 120
CTTCCCAGTA TTGCTACTGA TGTATTGTCT CCTTATCTGG CTGCAATCTA TAATGCGGCA 180
TGTGAGCCAG TTACACCTTT GTTTAAAGCA ATGCGAGACA AGCTCGAGTC ATGCATTCTT 240
CAAATCCATG ATCAAAACTT TGGTGCTGAT GACGCTGACA TGGACAACAA CGCTTCCTCA 300
TACATGGAGG AGTTGCAGAG ATCGATTCTT CACTTCCGCA AGGAGTTCCT ATCTAGACTA 360
TTGCCTTCCG CAGCAAATGC TAACACTGCA GGAACAGAAT CGATCTGCAC AAGACTCACA 420
AGACAAATGG CGTCAAGGGT TTTGATCTTC TACATCAGAC ATGCATCCCT TGTGCGACCA 480
CTTTCAGAAT GGGGAAAACT CAGAATGGCC AAAGACATGG CCGAGCTGGA ACTAGCAGTG 540
GGACAGAATC TATTTCCCGT GGAACAACTC GGAGCACCGT ACAGAGCTCT TAGAGCGTTT 600
AGGCCTTTGG TTTTCCTGGA AACATCTCAA ATGGGATCAT CTCCTCTCAT CAATGATCTA 660
CCACCGAGCA TCGTCCTACA TCATCTCTAC ACAAGAGGCC CAGACGAGTT AGAGTCACCG 720
ATGCAGAAGA ACAGACTAAG TCCTAAACAG TACTCACTGT GGCTTGATAA CCAAAGAGAG 780
GATCAGATCT GGAAAGGGAT AAAAGCAACT TTGGATGATT ATGCAGTGAA GATCAGATCG 840
AGAGGGGACA AAGAGTTTAG TCCAGGTTAT CCTCTAATGC TTCAAATTGG TTCATCTTTA 900
ACACAAGAAA ACTTATAAGC TGTGCTTTGT TACCGAATCA ATATTCTTCT ATTGCGAACT 960
TTTTTGTCTC AAAAAA 976

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Thr Arg Gly Pro Glu Thr Arg Gln Ile Ser Gly Pro Ala Thr Ser Thr
1 5 10 15

Gln Ile Arg Asn Phe Thr Leu Cys Gln His Leu Gln Gly Ile His Thr
20 25 30

His Ile Ser Ser Met Val Ala Asp Leu Pro Ser Ile Ala Thr Asp Val
35 40 45

Leu Ser Pro Tyr Leu Ala Ala Ile Tyr Asn Ala Ala Cys Glu Pro Val
50 55 60

Thr Pro Leu Phe Lys Ala Met Arg Asp Lys Leu Glu Ser Cys Ile Leu
65 70 75 80

Gln Ile His Asp Gln Asn Phe Gly Ala Asp Asp Ala Asp Met Asp Asn
85 90 95

Asn Ala Ser Ser Tyr Met Glu Glu Leu Gln Arg Ser Ile Leu His Phe
100 105 110

Arg Lys Glu Phe Leu Ser Arg Leu Leu Pro Ser Ala Ala Asn Ala Asn
115 120 125

Thr Ala Gly Thr Glu Ser Ile Cys Thr Arg Leu Thr Arg Gln Met Ala
130 135 140

Ser Arg Val Leu Ile Phe Tyr Ile Arg His Ala Ser Leu Val Arg Pro
145 150 155 160

Leu Ser Glu Trp Gly Lys Leu Arg Met Ala Lys Asp Met Ala Glu Leu
165 170 175

Glu Leu Ala Val Gly Gln Asn Leu Phe Pro Val Glu Gln Leu Gly Ala
180 185 190

Pro Tyr Arg Ala Leu Arg Ala Phe Arg Pro Leu Val Phe Leu Glu Thr
195 200 205

Ser Gln Met Gly Ser Ser Pro Leu Ile Asn Asp Leu Pro Pro Ser Ile
210 215 220

Val Leu His His Leu Tyr Thr Arg Gly Pro Asp Glu Leu Glu Ser Pro
225 230 235 240

Met Gln Lys Asn Arg Leu Ser Pro Lys Gln Tyr Ser Leu Trp Leu Asp
245 250 255

Asn Gln Arg Glu Asp Gln Ile Trp Lys Gly Ile Lys Ala Thr Leu Asp
260 265 270

Asp Tyr Ala Val Lys Ile Arg Ser Arg Gly Asp Lys Glu Phe Ser Pro
275 280 285

Gly Tyr Pro Leu Met Leu Gln Ile Gly Ser Ser Leu Thr Gln Glu Asn
290 295 300

Leu 305
305



Claims 

1. Recombinant AFT1 polypeptide. 

2. A recombinant polypeptide comprising an amino acid sequence substantially identical to the amino acid sequence 
of AFT1 polypeptide shown in Fig. 1 (SEQ ID NO:2). 

3. A recombinant polypeptide which is a fragment or analog of an AFT1 polypeptide comprising a domain capable of 
activating transcription. 

4. The polypeptide of claim 3, wherein said polypeptide is AFT1 (34-248) or AFT1 (122-248).

5. The polypeptide of claim 1, 2 or 3, wherein said polypeptide is derived from a plant.

6. The polypeptide of claim 5, wherein said plant is a crucifer.

7. The polypeptide of claim 6, wherein said plant is Arabidopsis.

8. A chimeric AFT1 transcriptional activator protein comprising an AFT1 polypeptide fused to a DNA-binding polypeptide.

9. The chimeric AFT1 transcriptional activator protein of claim 8, wherein said DNA-binding polypeptide comprises
Gal4 or LexA.

10. A transgenic plant containing a transgene comprising an AFT1 polypeptide operably linked to a constitutive or regulated
promoter.

11. A transgenic plant containing a transgene comprising a chimeric AFT1 of claim 8.

12. A seed from a transgenic plant of claim 10 or 11.

13. A cell from a transgenic plant of claim 10 or 11.

14. A transgenic plant expressing a polypeptide of interest comprising:

(a) a nucleic acid sequence encoding the chimeric AFT1 transcriptional activator protein of claim 8; and
(b) a nucleic acid encoding said polypeptide of interest in an expressible genetic construction, wherein the
binding of said chimeric protein regulates the expression of said polypeptide of interest.

15. The polypeptide of claim 14, wherein said polypeptide comprises a plant storage protein gene.

16. Substantially pure DNA encoding an AFT1 protein.

17. Substantially pure DNA encoding a recombinant polypeptide comprising an amino acid sequence substantially identical
to the amino acid sequence of AFT1 polypeptide shown in Fig. 1 (SEQ ID NO:1).

18. The DNA of claims 16 and 17, wherein said DNA is operably linked to a constitutive or regulated promoter.

19. The DNA of claim 18, wherein said DNA is cDNA.

20. The DNA of claim 18, wherein said DNA is of the genus Arabidopsis.

21. A vector comprising the DNA substantially pure DNA encoding an AFT1 protein, said vector being capable of directing
expression of the protein encoded by said DNA in a vector-containing cell.

22. A cell which contains the DNA of claim 16, claim 17, or the vector of claim 21.

23. The cell of claim 22, said cell being a plant cell.

24. A transgenic plant which contains the substantially pure DNA encoding an AFT1 protein.

25. A transgenic plant containing the substantially pure DNA encoding a recombinant polypeptide comprising an amino
acid sequence substantially identical to the amino acid sequence of AFT1 polypeptide shown in Fig. 1 (SEQ ID
NO:1).

26. A seed from a transgenic plant of claim 24 or claim 25.

27. A cell from a transgenic plant of claim 24 or claim 25.

28. A recombinant polypeptide which is a fragment or analog of an AFT1 polypeptide comprising a domain capable of
interacting with a plant defense related protein.

29. The polypeptide of claim 28, wherein said polypeptide is AFT1 (33-194).

30. Substantially pure DNA encoding an AFT1 polypeptide fragment or analog of claim 28.

31. The DNA of claim 30, wherein said DNA is substantially identical to the DNA sequence shown in SEQ ID NO: 1.

32. The DNA of claim 31, wherein said DNA is operably linked to a constitutive or regulated promoter.

(SEQ ID NO: 1)

1 AAAAAAAAATCAAATCTCTCTCTTTCTCTCTCTAATGGCGGCGACATTAGGCAGAGACCA
MAATLGRDQ 9

(SEQ ID NO: 2)

61 GTATGTGTACATGGCGAAGCTCGCCGAGCAGGCGGAGCGTTACGAAGAGATGGTTCAATT
YVYMAKLAEQAERYEEMVQF 29

121 CATGGAACAGCTCGTTACAGGCGCTACTCCAGCGGAAGAGCTCACCGTTGAAGAGAGGAA
MEQLVTGATPAEELTVEERN 49

181 TCTCCTCTCTGTTGCTTACAAGAACGTGATCGGATCTCTACGCGCCGCCTGGAGGATCGT
LLSVAYKNVIGSLRAAWRIV 69

241 GTCTTCGATTGAGCAGAAGGAAGAGAGTAGGAAGAACGACGAGCACGTGTCGCTTGTCAA
SSIEQKEESRKNDEHVSLVK 89

301 GGATTACAGATCTAAAGTTGAGTCTGAGCTTTCTTCTGTTTGCTCTGGAATCCTTAAGCT
DYRSKVESELSSVCSGILKL 109

361 CCTTGACTCGCATCTGATCCCATCTGCTGGAGCGAGTGAGTCTAAGGTCTTTTACTTGAA
LDSHLIPSAGASESKVFYLK 129

421 GATGAAAGGTGATTATCATCGGTACATGGCTGAGTTTAAGTCTGGTGATGAGAGGAAAAC
MKGDYHRYMAEFKSGDERKT 149

481 TGCTGCTGAAGATACCATGCTCGCTTACAAAGCAGCTCAGGATATCGCAGCTGCGGATAT
AAEDTMLAYKAAQDIAAADM 169

541 GGCACCTACTCATCCGATAAGGCTTGGTCTGGCCCTGAATTTCTCAGTGTTCTACTATGA
APTHPIRLGLALNFSVFYYE 189

601 GATTCTCAATTCTTCAGACAAAGCTTGTAACATGGCCAAACAGGCTTTTGAGGAGGCCAT
ILNSSDKACNMAKQAFEEAI 209

661 AGCTGAGCTTGACACTCTGGGAGAGGAATCCTACAAAGACAGCACTCTCATAATGCAGTT
AELDTLGEESYKDSTLIMQL 229

721 GCTGAGGGACAATTTAACCCTTTGGACCTCCGATATGCAGGAGCAGATGGACGAGGCCTG
LRDNLTLWTSDMQEQMDEA 248

781 AGGATCTAGATGAAGGGGGGGAGGGTTGTTACGCGATGTTTCTGCCACCAAATCGATCTC
841 AAAAT
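
As an illustrative cross-check of the correspondence shown in this figure between the nucleotide sequence (SEQ ID NO: 1) and the deduced amino acid sequence (SEQ ID NO: 2), the following minimal Python sketch translates nucleotides 1-120 of SEQ ID NO: 1. It assumes the standard genetic code and that translation begins at the first ATG. The function and variable names are hypothetical and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# Nucleotides 1-120 of SEQ ID NO: 1 (first two rows of the figure).
SEQ1_NT_1_120 = (
    "AAAAAAAAATCAAATCTCTCTCTTTCTCTCTCTAATGGCGGCGACATTAGGCAGAGACCA"
    "GTATGTGTACATGGCGAAGCTCGCCGAGCAGGCGGAGCGTTACGAAGAGATGGTTCAATT"
)


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the last complete codon."""
    dna = dna.upper().replace(" ", "")
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate_from_first_atg(SEQ1_NT_1_120))
```

The translated residues agree with the first 28 residues of the deduced sequence shown in the figure; applying the same function to the full 845-nucleotide sequence would be the natural next step.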
B42/AFT1 Derivatives    Growth    β-Galactosidase
B42/1-248               +         10.9
B42/1-121               -         1.7
B42/34-248              +         21.2
B42/122-248             +         15.3
B42/34-194
B42 alone                         1.7

FIG. 3A

LexA/AFT1 Derivatives   Growth    β-Galactosidase
LexA/1-248              +         39.2
LexA/1-194              -         0.7
LexA/1-121              -         0.6
LexA/34-248             +         9.3
LexA/122-248            -         1.2
LexA alone                        0.8

FIG. 3B
1 TCACCCAGAG AGGTCAGGCT TTGATGGACC ATGGACCCAA GAGCCGCTGA

51 AGTTTGACAA CTCCTACTTC GTGGAACTGC TGAAAGGAGA ATCAGAGGGC

101 TTGTTGAAAC TTCCAACTGA CAAGACCTTA TTGGAAGACC CGGAGTTCCG

151 TCGTCTTGTT GAGCTTTATG CAAAGGATGA AGATGCATTC TTCAGAGACT

201 ACGCGGAATC GCACAAGAAA CTCTCTGAGC TTGGTTTCAA CCCAAACTCC

251 TCAGCAGGCA AAGCAGTTGC AGACAGCACG ATTCTGGCAC AGAGTGCGTT

301 CGGGGTTGCA GTTGCTGCTG CGGTTGTGGC ATTTGGTTAC TTTTACGAGA

351 TTCGGAAGAG GATGAAGTAA ACGAAATAGG AAGGAAAACA CGAAGCAACG

401 ATGCTCTTAT TTGGGTATTA AAGAAACTAT TAATCGTCTA TCGAATCTAT

451 TTTGCTGCTA CAAGATTCTA AACTCTTTGA ATCCACGATT CCACTGTTTA

501 GTAGTAAAAA AGTTAAAAAG TCAATATTTT GGGTCCGTGA TTCATTTTTG

551 CGATAAA

(SEQ ID NO: 17)
1 HPERSGFDGP WTQEPLKFDN SYFVELLKGE SEGLLKLPTD KTLLEDPEFR 
51 RLVELYAKDE DAFFRDYAES HKKLSELGFN PNSSAGKAVA DSTILAQSAF 
101 GVAVAAAVVA FGYFYEIRKR MK* 

(SEQ ID NO: 18) 

1 GAGTGACGAA CATTGCGTGA AATTCTTGAA GAAGTGCTAC GAGTCACTTC 

51 CAGAGGATGG AAAAGTGATA TTAGCAGAGT GTATTCTTCC AGAGACACCA 

101 GACTCAAGCC TCTCAACCAA ACAAGTAGTC CATGTCGATT GCATTATGTT 

151 GGCTCACAAT CCCGGAGGCA AAGAACGAAC CGAGAAAGAG TTTGAGGCAT 

201 TAGCCAAAGC ATCAGGCTTC AAGGGCATCA AAGTTGTCTG CGACGCTTTT 

251 GGTGTTAACC TTATTGAGTT ACTCAAGAAG CTCTAAAAAC AAACAATGTT 

301 CCTATGAAGA TGATTTATAT GTAAACATTA TCTCATATCT CCTTCCACGG 

351 TTCCAAAACT ATGCTGTTTA ATAATGGTTT TTACAAGAAT TTGATTATGA 

401 GTTTGTATTT TTGTTTGTTT GGAACAAAAT TATGTGATTA TAGGGAAAAA 

451 TAAAATGAGC TATTATTGAA GAAAAAAA 

(SEQ ID NO: 19) 
1 SDEHCVKFLK NCYESLPEDG KVILAECILP ETPDSSLSTK QVVHVDCIML 
51 AHNPGGKERT EKEFEALAKA SGFKGIKVVC DAFGVNLIEL LKKL* 

(SEQ ID NO: 20) 

FIG. 9 
1 CCAGATTATC CCTCCCCCGA ATTCGGCACG AGGAAAAATC CTCTTCTTTC 

51 AGATGAGAAA CCCAAATCGA CGGAGGAGAA TAAGAGTTCT AAGCCGGAAT 

101 CAGCTTCTGG GAGTTCAACT TCATCAGCTA TGCCTGGCTT GAATTTCAAT 

151 GCTTTTGATT TCTCTAATAT GGCTAGTATT CTCAACGATC CTAGCATCAG 

201 AGAAATGGCT GAGCAAATAG CTAAAGATCC TGCCTTTAAC CAATTGGCTG 

251 AGCAGCTTCA GAGATCTATT CCTAACGCTG GCCAGGAAGG TGGTTTCCCT 

301 AACTTTGATC CTCAACAGTA TGTCAATACA ATGCAACAGG TTATGCATAA 

351 CCCTGAGTTT AAGACAATGG CCGAGAAACT TGGTACCGCC TTAGTTCAGG 

401 ATCCACAAAT GTCTCCTTTT TTGGATGCTT TCTCGAATCC TGAAACAGCA 

451 GAACACTTTA CTGAGCGTAT GGCGCGGATG AAAGAAGATC CAGAGTTGAA 

501 ACCTATACTA GATGAGATTG ATGCTGGTGG TCCTTCTGCC ATGATGAAGT 

551 ACTGGAATGA TCCAGAAGTG CTGAAAAAGC TGGGTGAAGC AATGGGTATG 

601 CCTGTTGCTG GCTTACCAGA CCAGACTGTT TCAGCTGAAC CTGAGGTAGC 

651 AGAAGAAGGT GAAGAAGAAG AGTCTATTGT TCACCAAACT GCCAGTCTTG 

701 GTGATGTTGA GGGTTTGAAA GCTGCCTTGG CATCTGGTGG TAACAAAGAT 

751 GAAGAAGATT CTGAAGGAAG GACAGCATTG CATTTTGCTT GTGGATACGG 

801 CGAGTTGAAA TGTGCTCAAG TTCTTATCGA TGCTGGAGCA AGTGTTAATG 

851 CGGTTGACAA AAACAAGAAC ACACCTCTGC ATTATGCTGC TGGTTACGGG 

901 AGGAAAGAGA GTGTAAGCCT TCTCCTGGAG AATGGTGCTG CAGTCACTCT 

951 GCAAAACCTA GACGAGAAGA CGCCAATTGA TGTAGCGAAG CTCAACAGCC 

1001 AGCTGGAGGT GGTGAAGCTG CTTGAGAAGG ATGCTTTCCT TTGAGCTCTG 

1051 CTGGTTAAAG GAAAGCTCTA AGCTCATATT GTCTTTGAGG CATTTGTCTT 

1101 GTGTGTGTCC TGAACCAGTT TCACAGGCTT TTTGTGTACA CTTTTTATTA 

1151 GTTCCTCTCT TCTTCTAAAT TTGTCTCTTA TGTTGTTTTA AAAGTCAATA 

1201 AAGAAAGAAA TAGCAATCAA TGATTTAATT TATGATTATA TTCTTTATTT 

1251 CGTCGACCTC TACAGAATGA TTCAATTTGG AAGAATCATT CTGGTTTGGA 

1301 GGATATGTAA GAAAAACTAC TTGATCTCCA AGTTATTCCA TTCTTCTGTT 

1351 GAAAAAA 

(SEQ ID NO: 21) 
1 GTRKNPLLSD EKPKSTEENK SSKPESASGS STSSAMPGLN FNAFDFSNMA 

51 SILNDPSIRE MAEQIAKDPA FNQLAEQLQR SIPNAGQEGG FPNFDPQQYV 

101 NTMQQVMHNP EFKTMAEKLG TALVQDPQMS PFLDAFSNPE TAEHFTERMA 

151 RMKEDPELKP ILDEIDAGGP SAMMKYWNDP EVLKKLGEAM GMPVAGLPDQ 

201 TVSAEPEVAE EGEEEESIVH QTASLGDVEG LKAALASGGN KDEEDSEGRT 

251 ALHFACGYGE LKCAQVLIDA GASVNAVDKN KNTPLHYAAG YGRKESVSLL 

301 LENGAAVTLQ NLDEKTPIDV AKLNSQLEVV KLLEKDAFL* 

(SEQ ID NO: 22) 
1 TTTTAAAAAA TTTTGCCATC AACCGTAGAT GTTCCGCCAA AGGGTGGGTT 

51 TAGCTTCGAT CTGTGTAAGA GAAATGATAT TCTTACACAA AAGGGTCTTA 

101 AAGCTCCGTC TTTTTTGAAG ACTGGAACAA CCATTGTTGG TTTGATTTTC 

151 AAGGATGGTG TGATACAAGG GGCAGATACC CGAGCAACTG AGGGGCCAAT 

201 TGTTGCTGAT AAGAACTGTG AGAAGATTCA CTATATGGCA CCAAACATAT 

251 ATTGCTGTGG TGCAGGAACT CGGGCTGATA CTGAAGCAGT CACTGATATG 

301 GTCAGCTCAC AGCTGCGATT GGATCGTTAC CAGACTGGTC GAGACTCTCG 

351 GGTCATTACT GCTTTGACCC TTCTCAAAAA ACATTTTTTC AGCTACCAAG 

401 GTCATGTCTC TGCTGCTCTT GTACTCGGTG GAGTTGATAT CACTGGTCCA 

451 CATCTGCATA CTATATACCC ACACGGTTCA ACTGACACTC TTCCATTCGC 

501 CACAATGGGT TCGGGTTCTC TTGCTGCTAT GTCTGTGTTT GAGGCAAAGT 

551 ATAAAGAAGG CCTAACTAGG GATGAAGGAA TTAAGCTGGT CGCTGAATCC 

601 ATATGCTCGG GTATATCCAA TGACCTGGGT AGTGGTAGCA ACGTGGACAT 

651 CTGCGTGATC ACA 

(SEQ ID NO: 23) 
KILPSTVD VPPKGGFSFD LCKKNDILTQ KGLKAPSFLK TGTTIVGLIF 
KDGVIQGADT RATEGPIVAD KNCEKIHYMA PNIYCCGAGT RADTEAVTDM 
VSSQLRLHRY QTGRDSRVIT ALTLLKKHFF SYQGHVSAAL VLGGVDITGP 
HLHTIYPHGS TDTLPFATMG SGSLAAMSVF EAKYKEGLTR DEGIKLVAES 
ICSGISNDLG SGSNVDICVI T 

(SEQ ID NO: 24) 
1 ACGAGAGGCC CTGAGACGCG GCAGATATCA GGTCCTGCGA CTTCAACACA 

51 GATCAGGAAC TTCACATTAT GTCAGCATCT GCAAGGAATC CACACACATA 

101 TCTCATCCAT GGTAGCGGAC CTTCCCAGTA TTGCTACTGA TGTATTGTCT 

151 CCTTATCTGG CTGCAATCTA TAATGCGGCA TGTGAGCCAG TTACACCTTT 

201 GTTTAAAGCA ATGCGAGACA AGCTCGAGTC ATGCATTCTT CAAATCCATG 

251 ATCAAAACTT TGGTGCTGAT GACGCTGACA TGGACAACAA CGCTTCCTCA 

301 TACATGGAGG AGTTGCAGAG ATCGATTCTT CACTTCCGCA AGGAGTTCCT 

351 ATCTAGACTA TTGCCTTCCG CAGCAAATGC TAACACTGCA GGAACAGAAT 

401 CGATCTGCAC AAGACTCACA AGACAAATGG CGTCAAGGGT TTTGATCTTC 

451 TACATCAGAC ATGCATCCCT TGTGCGACCA CTTTCAGAAT GGGGAAAACT 

501 CAGAATGGCC AAAGACATGG CCGAGCTGGA ACTAGCAGTG GGACAGAATC 

551 TATTTCCCGT GGAACAACTC GGAGCACCGT ACAGAGCTCT TAGAGCGTTT 

601 AGGCCTTTGG TTTTCCTGGA AACATCTCAA ATGGGATCAT CTCCTCTCAT 

651 CAATGATCTA CCACCGAGCA TCGTCCTACA TCATCTCTAC ACAAGAGGCC 

701 CAGACGAGTT AGAGTCACCG ATGCAGAAGA ACAGACTAAG TCCTAAACAG 

751 TACTCACTGT GGCTTGATAA CCAAAGAGAG GATCAGATCT GGAAAGGGAT 

801 AAAAGCAACT TTGGATGATT ATGCAGTGAA GATCAGATCG AGAGGGGACA 

851 AAGAGTTTAG TCCAGGTTAT CCTCTAATGC TTCAAATTGG TTCATCTTTA 

901 ACACAAGAAA ACTTATAAGC TGTGCTTTGT TACCGAATCA ATATTCTTCT 

951 ATTGCGAACT TTTTTGTCTC AAAAAA 

(SEQ ID NO: 25) 



EP 0 693 554 A1 

1 TRGPETRQIS GPATSTQIRN FTLCQHLQGI HTHISSMVAD LPSIATDVLS 

51 PYLAAIYNAA CEPVTPLFKA MRDKLESCIL QIHDQNFGAD DADMDNNASS 

101 YMEELQRSIL HFRKEFLSRL LPSAANANTA GTESICTRLT RQMASRVLIF 

151 YIRHASLVRP LSEWGKLRMA KDMAELELAV GQNLFPVEQL GAPYRALRAF 

201 RPLVFLETSQ MGSSPLINDL PPSIVLHHLY TRGPDELESP MQKNRLSPKQ 

251 YSLWLDNQRE DQIWKGIKAT LDDYAVKIRS RGDKEFSPGY PLMLQIGSSL 

301 TQENL* 

(SEQ ID NO: 26) 